Bellabeat Case Study:

How Can a Wellness Technology Company Play it Smart?

1. Ask:

Background:

Founded in 2013 by artist Urška Sršen and mathematician Sando Mur, Bellabeat is a health and wellness technology company that seeks to inform, inspire, and empower women to make health-conscious decisions. Bellabeat manufactures smart devices that track an individual’s activity, sleep, stress, mindfulness, and reproductive health habits. Combined with the companion Bellabeat app and a subscription-based membership model, these products deliver personalized data and guidance to match a user’s current lifestyle and desired goals. In 2016, Bellabeat released multiple new products (Leaf, Time, and Spring) and expanded its offices around the world. Bellabeat products are available through a number of online retailers and are promoted extensively on digital marketing platforms such as Google, Facebook, Instagram, Twitter, and YouTube.

Purpose:

The goal of this project is to find opportunities for growth in current and future markets by using smart device usage data to deliver insights that can drive strategic innovation and give Bellabeat a greater global presence. Identifying trends in non-Bellabeat smart device usage will help Bellabeat customers and its marketing strategists. After identifying these trends, the project will study how non-Bellabeat user habits relate to the tracking capabilities of Bellabeat products and then form recommendations that meet the needs of potential users. These recommendations include updates to the Bellabeat app and product design enhancements that would be most useful for driving consumer demand and retention. The final deliverables will include a clear summary of the business task, a description of all data sources used, documentation of any cleaning or manipulation of data, a summary of the analysis, supporting visualizations and key findings, and recommendations based on the analysis.

Business Task: Discover opportunities for growth in current and future markets by using non-Bellabeat smart device usage data to identify trends compatible with Bellabeat products and deliver insights that can drive strategic innovation and give Bellabeat a greater global presence.

Key Stakeholders:

Urška Sršen – cofounder, artist, and Chief Creative Officer

Sando Mur – cofounder, mathematician, and executive team member

Bellabeat marketing analytics team – data analysts, marketers, and strategists

SMART Questions:

Specific – Does Bellabeat currently use smart device data to drive important business decisions? What metrics are collected from smart devices and how are they used?

Measurable – How do activity, sleep, stress, mindfulness, and reproductive habit trends correlate with health in smart devices? How do these smart device trends correlate with health in Bellabeat products like Leaf, Time, or Spring? Which products follow trends that are most significant to measure for users?

Action-oriented – What non-Bellabeat smart device data trends are important to a smart device consumer?

Relevant – How does data from non-Bellabeat users influence consumer decisions to use smart devices and more importantly, Bellabeat products?

Time-bound – How has Bellabeat incorporated smart device data in the past few years to form decisions that result in an increase of smart device consumption?

2. Prepare:

Data Set: FitBit Fitness Tracker Data – “Data set contains personal fitness tracker from thirty FitBit users.  Thirty eligible FitBit users consented to submission of personal tracker data. Minute-level output for physical activity, heart rate, and sleep monitoring. Information about daily activity, steps and heart rate can be used to explore user habits.”

Data Source: https://www.kaggle.com/datasets/arashnic/fitbit

Data Source Types: Primary Data, External Data, Continuous Data, Quantitative Data, and Structured Data.

File Type: 2 file directories with 29 .csv files, most in long format and 3 in wide format (the minute-level calories, intensities, and steps files for April to May).

File Size: 587 MB

Creator/Date: Collected through a survey distributed via Amazon Mechanical Turk between March 12, 2016 and May 12, 2016.

Expected update frequency: Annually.

Licensing: CC0: Public Domain.

Citation: Furberg, Robert; Brinton, Julia; Keating, Michael; Ortiz, Alexa. https://zenodo.org/record/53894#.X9oeh3Uzaao

Bias or Credibility: The data comes from a different smart device company (Fitbit), not from Bellabeat users. Credibility is limited by the small sample of thirty participants and by coverage of only select months in 2016.

Data Integrity: Made a copy of the original data set and renamed the 29 comma-separated value (.csv) files to a standard naming pattern. Loaded, previewed, and standardized the data in RStudio. No missing values (n_missing) were detected in 27 of the 29 .csv files. The two exceptions are the weight log files for 3.12 to 4.12 and 4.12 to 5.12: the former contains 31 NA values in the “Fat” column and the latter contains 65 NA values in the same column. The number of distinct (n_distinct) numeric “Id” values is also inconsistent across files, even though the data set is described as covering 30 Fitbit users: most files contain 33-34 distinct Ids, while others contain 8, 11, 14, 23, 24, or 35. This points to possible under- and overreporting by users. Row counts scale with granularity, as the load output shows: the long-format minute-level files hold over a million rows each (7 digits), except minute-level sleep with roughly 190,000 to 200,000 rows (6 digits); the hourly files hold about 22,000 to 24,000 rows (5 digits); the daily activity, calories, intensities, steps, and sleep files hold 413 to 940 rows (3 digits); and the weight logs hold only 33 and 67 rows. The data set is incomplete: it does not include daily calories, daily intensities, daily steps, or sleep day files for March to April.
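The distinct-Id check described above can be sketched in a few lines. This is a minimal illustration in base R, not the exact procedure used in this report: the three small tables and their Id values are invented stand-ins for the loaded files, and `tables`, `id_counts`, and `shared_ids` are hypothetical names.

```r
# Toy stand-ins for loaded tables; the Id values are invented for illustration.
daily_activity <- data.frame(Id = c(101, 101, 102, 103),
                             TotalSteps = c(11004, 17609, 12736, 13231))
weight_log <- data.frame(Id = c(101, 103),
                         WeightKg = c(53.3, 63.4))
sleep_day <- data.frame(Id = c(101, 102, 102),
                        TotalMinutesAsleep = c(327, 384, 412))

# Keep the tables in a named list so one check can run over all of them.
tables <- list(daily_activity = daily_activity,
               weight_log = weight_log,
               sleep_day = sleep_day)

# Number of distinct participants reported by each table.
id_counts <- vapply(tables, function(df) length(unique(df$Id)), integer(1))
print(id_counts)

# Participants that appear in every table.
shared_ids <- Reduce(intersect, lapply(tables, function(df) unique(df$Id)))
print(shared_ids)
```

Comparing `id_counts` across tables shows at a glance which files cover fewer participants, and `shared_ids` gives the subset of users that any cross-file comparison could safely rely on.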

Setting up my environment

Set up the R environment by loading the tidyverse, here, skimr, and janitor packages.

library("tidyverse")
## Warning: package 'tidyverse' was built under R version 4.4.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library("here")
## Warning: package 'here' was built under R version 4.4.3
## here() starts at C:/Users/kevin
library("skimr")
## Warning: package 'skimr' was built under R version 4.4.3
library("janitor")
## Warning: package 'janitor' was built under R version 4.4.3
## 
## Attaching package: 'janitor'
## 
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
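Before calling `library()`, it can help to confirm that every package is installed. The following is a small sketch using only base R; `missing_packages` is a hypothetical helper written for this illustration and is not part of any of the packages loaded above.

```r
# Packages this report loads.
required_packages <- c("tidyverse", "here", "skimr", "janitor")

# Hypothetical helper: return the names in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required_packages)
if (length(to_install) > 0) {
  # Report only; installing is left to the reader.
  message("Not installed: ", paste(to_install, collapse = ", "))
}
```

Anything listed in `to_install` could then be passed to `install.packages()` before the `library()` calls are run.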

Normalization of Data

Loading, previewing and standardizing the dataset files.

daily_activity_mar_apr <- read_csv("dailyactivity_3_to_4.csv")
## Rows: 457 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityDate
## dbl (14): Id, TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDi...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
heart_rate_mar_apr <- read_csv("heartrate_seconds_3_to_4.csv")
## Rows: 1154681 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Time
## dbl (2): Id, Value
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
hourly_calories_mar_apr <- read_csv("hourlycalories_3_to_4.csv")
## Rows: 24084 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityHour
## dbl (2): Id, Calories
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
hourly_intensities_mar_apr <- read_csv("hourlyintensities_3_to_4.csv")
## Rows: 24084 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityHour
## dbl (3): Id, TotalIntensity, AverageIntensity
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
hourly_steps_mar_apr <- read_csv("hourlysteps_3_to_4.csv")
## Rows: 24084 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityHour
## dbl (2): Id, StepTotal
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_calories_mar_apr <- read_csv("minutecaloriesnarrow_3_to_4.csv")
## Rows: 1445040 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityMinute
## dbl (2): Id, Calories
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_intensities_mar_apr <- read_csv("minuteintensitiesnarrow_3_to_4.csv")
## Rows: 1445040 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityMinute
## dbl (2): Id, Intensity
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_mets_mar_apr <- read_csv("minutemetsnarrow_3_to_4.csv")
## Rows: 1445040 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityMinute
## dbl (2): Id, METs
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_sleep_mar_apr <- read_csv("minutesleep_3_to_4.csv")
## Rows: 198559 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): date
## dbl (3): Id, value, logId
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_steps_mar_apr <- read_csv("minutestepsnarrow_3_to_4.csv")
## Rows: 1445040 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityMinute
## dbl (2): Id, Steps
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
weight_log_mar_apr <- read_csv("weightloginfo_3_to_4.csv")
## Rows: 33 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Date
## dbl (6): Id, WeightKg, WeightPounds, Fat, BMI, LogId
## lgl (1): IsManualReport
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
daily_activity_apr_may <- read_csv("dailyactivity_4_to_5.csv")
## Rows: 940 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityDate
## dbl (14): Id, TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDi...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
daily_calories_apr_may <- read_csv("dailycalories_4_to_5.csv")
## Rows: 940 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityDay
## dbl (2): Id, Calories
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
daily_intensities_apr_may <- read_csv("dailyintensities_4_to_5.csv")
## Rows: 940 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityDay
## dbl (9): Id, SedentaryMinutes, LightlyActiveMinutes, FairlyActiveMinutes, Ve...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
daily_steps_apr_may <- read_csv("dailysteps_4_to_5.csv")
## Rows: 940 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityDay
## dbl (2): Id, StepTotal
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
heart_rate_apr_may <- read_csv("heartrate_seconds_4_to_5.csv")
## Rows: 2483658 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Time
## dbl (2): Id, Value
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
hourly_calories_apr_may <- read_csv("hourlycalories_4_to_5.csv")
## Rows: 22099 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityHour
## dbl (2): Id, Calories
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
hourly_intensities_apr_may <- read_csv("hourlyintensities_4_to_5.csv")
## Rows: 22099 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityHour
## dbl (3): Id, TotalIntensity, AverageIntensity
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
hourly_steps_apr_may <- read_csv("hourlysteps_4_to_5.csv")
## Rows: 22099 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityHour
## dbl (2): Id, StepTotal
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_calories_apr_may <- read_csv("minutecaloriesnarrow_4_to_5.csv")
## Rows: 1325580 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityMinute
## dbl (2): Id, Calories
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_calories_apr_may_2 <- read_csv("minutecalorieswide_4_to_5.csv")
## Rows: 21645 Columns: 62
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityHour
## dbl (61): Id, Calories00, Calories01, Calories02, Calories03, Calories04, Ca...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_intensities_apr_may <- read_csv("minuteintensitiesnarrow_4_to_5.csv")
## Rows: 1325580 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityMinute
## dbl (2): Id, Intensity
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_intensities_apr_may_2 <- read_csv("minuteintensitieswide_4_to_5.csv")
## Rows: 21645 Columns: 62
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityHour
## dbl (61): Id, Intensity00, Intensity01, Intensity02, Intensity03, Intensity0...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_mets_apr_may <- read_csv("minutemetsnarrow_4_to_5.csv")
## Rows: 1325580 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityMinute
## dbl (2): Id, METs
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_sleep_apr_may <- read_csv("minutesleep_4_to_5.csv")
## Rows: 188521 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): date
## dbl (3): Id, value, logId
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_steps_apr_may <- read_csv("minutestepsnarrow_4_to_5.csv")
## Rows: 1325580 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): ActivityMinute
## dbl (2): Id, Steps
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
minute_steps_apr_may_2 <- read_csv("minutestepswide_4_to_5.csv")
## Rows: 21645 Columns: 62
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): ActivityHour
## dbl (61): Id, Steps00, Steps01, Steps02, Steps03, Steps04, Steps05, Steps06,...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
sleep_day_apr_may <- read_csv("sleepday_4_to_5.csv")
## Rows: 413 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): SleepDay
## dbl (4): Id, TotalSleepRecords, TotalMinutesAsleep, TotalTimeInBed
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
weight_log_apr_may <- read_csv("weightloginfo_4_to_5.csv")
## Rows: 67 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Date
## dbl (6): Id, WeightKg, WeightPounds, Fat, BMI, LogId
## lgl (1): IsManualReport
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
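The 29 separate `read_csv()` calls above could also be written as one loop over the files in a folder. The sketch below is one possible approach, not the one used in this report. It uses base `read.csv()` in place of `read_csv()` so it runs without extra packages, and it writes two tiny demo files into a temporary folder; the folder name `fitbit_demo` and the names `csv_paths`, `table_names`, and `tables` are assumptions made for illustration.

```r
# Create two tiny CSV files in a temporary folder so the sketch runs anywhere.
data_dir <- file.path(tempdir(), "fitbit_demo")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("Id,StepTotal", "1503960366,0"),
           file.path(data_dir, "hourlysteps_3_to_4.csv"))
writeLines(c("Id,Calories", "1503960366,48", "1503960366,48"),
           file.path(data_dir, "hourlycalories_3_to_4.csv"))

# Find every .csv file in the folder.
csv_paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each table after its file, without the extension.
table_names <- tools::file_path_sans_ext(basename(csv_paths))

# Read all files into one named list of data frames.
tables <- setNames(lapply(csv_paths, read.csv), table_names)

# Row count per table, comparable to the "Rows:" lines in the load output.
row_counts <- vapply(tables, nrow, integer(1))
print(row_counts)
```

Keeping the tables in a named list also makes it straightforward to apply the same preview or summary function to every file with `lapply()`.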
skim_without_charts(daily_activity_mar_apr)
Data summary
Name daily_activity_mar_apr
Number of rows 457
Number of columns 15
_______________________
Column type frequency:
character 1
numeric 14
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityDate 0 1 8 9 0 32 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.628595e+09 2.293781e+09 1503960366 2.347168e+09 4.057193e+09 6.391747e+09 8.877689e+09
TotalSteps 0 1 6.546560e+03 5.398490e+03 0 1.988000e+03 5.986000e+03 1.019800e+04 2.849700e+04
TotalDistance 0 1 4.660000e+00 4.080000e+00 0 1.410000e+00 4.090000e+00 7.160000e+00 2.753000e+01
TrackerDistance 0 1 4.610000e+00 4.070000e+00 0 1.280000e+00 4.090000e+00 7.110000e+00 2.753000e+01
LoggedActivitiesDistance 0 1 1.800000e-01 8.500000e-01 0 0.000000e+00 0.000000e+00 0.000000e+00 6.730000e+00
VeryActiveDistance 0 1 1.180000e+00 2.490000e+00 0 0.000000e+00 0.000000e+00 1.310000e+00 2.192000e+01
ModeratelyActiveDistance 0 1 4.800000e-01 8.300000e-01 0 0.000000e+00 2.000000e-02 6.700000e-01 6.400000e+00
LightActiveDistance 0 1 2.890000e+00 2.240000e+00 0 8.700000e-01 2.930000e+00 4.460000e+00 1.251000e+01
SedentaryActiveDistance 0 1 0.000000e+00 1.000000e-02 0 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e-01
VeryActiveMinutes 0 1 1.662000e+01 2.892000e+01 0 0.000000e+00 0.000000e+00 2.500000e+01 2.020000e+02
FairlyActiveMinutes 0 1 1.307000e+01 3.621000e+01 0 0.000000e+00 1.000000e+00 1.600000e+01 6.600000e+02
LightlyActiveMinutes 0 1 1.700700e+02 1.222100e+02 0 6.400000e+01 1.810000e+02 2.570000e+02 7.200000e+02
SedentaryMinutes 0 1 9.952800e+02 3.370200e+02 32 7.280000e+02 1.057000e+03 1.285000e+03 1.440000e+03
Calories 0 1 2.189450e+03 8.154800e+02 0 1.776000e+03 2.062000e+03 2.667000e+03 4.562000e+03
head(daily_activity_mar_apr)
## # A tibble: 6 × 15
##           Id ActivityDate TotalSteps TotalDistance TrackerDistance
##        <dbl> <chr>             <dbl>         <dbl>           <dbl>
## 1 1503960366 3/25/2016         11004          7.11            7.11
## 2 1503960366 3/26/2016         17609         11.6            11.6 
## 3 1503960366 3/27/2016         12736          8.53            8.53
## 4 1503960366 3/28/2016         13231          8.93            8.93
## 5 1503960366 3/29/2016         12041          7.85            7.85
## 6 1503960366 3/30/2016         10970          7.16            7.16
## # ℹ 10 more variables: LoggedActivitiesDistance <dbl>,
## #   VeryActiveDistance <dbl>, ModeratelyActiveDistance <dbl>,
## #   LightActiveDistance <dbl>, SedentaryActiveDistance <dbl>,
## #   VeryActiveMinutes <dbl>, FairlyActiveMinutes <dbl>,
## #   LightlyActiveMinutes <dbl>, SedentaryMinutes <dbl>, Calories <dbl>
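The preview shows `ActivityDate` stored as character text in month/day/year order. One way to standardize it, sketched here in base R on the first three rows shown above, is to convert it to the `Date` class; the data frame name `activity` is an assumption for this illustration.

```r
# First three rows of the preview above, with only three of the columns.
activity <- data.frame(
  Id = c(1503960366, 1503960366, 1503960366),
  ActivityDate = c("3/25/2016", "3/26/2016", "3/27/2016"),
  TotalSteps = c(11004, 17609, 12736),
  stringsAsFactors = FALSE
)

# Convert month/day/year text into the Date class.
activity$ActivityDate <- as.Date(activity$ActivityDate, format = "%m/%d/%Y")

# Dates now sort and subtract correctly, e.g. to get the covered range.
date_range <- range(activity$ActivityDate)
print(date_range)
```

Once the column is a `Date`, filtering by period or grouping by weekday no longer depends on string comparison.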
skim_without_charts(heart_rate_mar_apr)
Data summary
Name heart_rate_mar_apr
Number of rows 1154681
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Time 0 1 19 21 0 510597 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 5.352122e+09 2.033584e+09 2022484408 4020332650 5553957443 6962181067 8877689391
Value 0 1 7.976000e+01 1.873000e+01 36 66 77 90 185
head(heart_rate_mar_apr)
## # A tibble: 6 × 3
##           Id Time                Value
##        <dbl> <chr>               <dbl>
## 1 2022484408 4/1/2016 7:54:00 AM    93
## 2 2022484408 4/1/2016 7:54:05 AM    91
## 3 2022484408 4/1/2016 7:54:10 AM    96
## 4 2022484408 4/1/2016 7:54:15 AM    98
## 5 2022484408 4/1/2016 7:54:20 AM   100
## 6 2022484408 4/1/2016 7:54:25 AM   101
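The `Time` column is character text with a 12-hour clock and an AM/PM marker. A possible conversion to a date-time class is sketched below in base R on the first three rows shown above. The `"UTC"` time zone is an assumption, since the source does not state the recording time zone, and parsing `%p` relies on an English-style AM/PM locale.

```r
# First three rows of the heart rate preview above.
heart_rate <- data.frame(
  Id = rep(2022484408, 3),
  Time = c("4/1/2016 7:54:00 AM", "4/1/2016 7:54:05 AM", "4/1/2016 7:54:10 AM"),
  Value = c(93, 91, 96),
  stringsAsFactors = FALSE
)

# %I is the 12-hour clock hour and %p the AM/PM marker.
heart_rate$Time <- as.POSIXct(heart_rate$Time,
                              format = "%m/%d/%Y %I:%M:%S %p",
                              tz = "UTC")  # assumed time zone

# Seconds between consecutive readings.
gaps <- as.numeric(diff(heart_rate$Time), units = "secs")
print(gaps)
```

The same format string should also fit the `ActivityHour`, `ActivityMinute`, and `date` columns in the other files, which show the same text layout in their previews.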
skim_without_charts(hourly_calories_mar_apr)
Data summary
Name hourly_calories_mar_apr
Number of rows 24084
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 755 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.889424e+09 2421565819.2 1503960366 2347167796 4558609924 6962181067 8877689391
Calories 0 1 9.427000e+01 59.4 42 61 77 104 933
head(hourly_calories_mar_apr)
## # A tibble: 6 × 3
##           Id ActivityHour          Calories
##        <dbl> <chr>                    <dbl>
## 1 1503960366 3/12/2016 12:00:00 AM       48
## 2 1503960366 3/12/2016 1:00:00 AM        48
## 3 1503960366 3/12/2016 2:00:00 AM        48
## 4 1503960366 3/12/2016 3:00:00 AM        48
## 5 1503960366 3/12/2016 4:00:00 AM        48
## 6 1503960366 3/12/2016 5:00:00 AM        48
skim_without_charts(hourly_intensities_mar_apr)
Data summary
Name hourly_intensities_mar_ap…
Number of rows 24084
Number of columns 4
_______________________
Column type frequency:
character 1
numeric 3
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 755 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.889424e+09 2.421566e+09 1503960366 2347167796 4.55861e+09 6.962181e+09 8877689391
TotalIntensity 0 1 1.083000e+01 2.031000e+01 0 0 1.00000e+00 1.400000e+01 180
AverageIntensity 0 1 1.800000e-01 3.400000e-01 0 0 2.00000e-02 2.300000e-01 3
head(hourly_intensities_mar_apr)
## # A tibble: 6 × 4
##           Id ActivityHour          TotalIntensity AverageIntensity
##        <dbl> <chr>                          <dbl>            <dbl>
## 1 1503960366 3/12/2016 12:00:00 AM              0                0
## 2 1503960366 3/12/2016 1:00:00 AM               0                0
## 3 1503960366 3/12/2016 2:00:00 AM               0                0
## 4 1503960366 3/12/2016 3:00:00 AM               0                0
## 5 1503960366 3/12/2016 4:00:00 AM               0                0
## 6 1503960366 3/12/2016 5:00:00 AM               0                0
skim_without_charts(hourly_steps_mar_apr)
Data summary
Name hourly_steps_mar_apr
Number of rows 24084
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 755 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.889424e+09 2.421566e+09 1503960366 2347167796 4558609924 6962181067 8877689391
StepTotal 0 1 2.862200e+02 6.649200e+02 0 0 10 289 10565
head(hourly_steps_mar_apr)
## # A tibble: 6 × 3
##           Id ActivityHour          StepTotal
##        <dbl> <chr>                     <dbl>
## 1 1503960366 3/12/2016 12:00:00 AM         0
## 2 1503960366 3/12/2016 1:00:00 AM          0
## 3 1503960366 3/12/2016 2:00:00 AM          0
## 4 1503960366 3/12/2016 3:00:00 AM          0
## 5 1503960366 3/12/2016 4:00:00 AM          0
## 6 1503960366 3/12/2016 5:00:00 AM          0
skim_without_charts(minute_calories_mar_apr)
Data summary
Name minute_calories_mar_apr
Number of rows 1445040
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityMinute 0 1 19 21 0 45300 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.889424e+09 2.421516e+09 1503960366 2.347168e+09 4.55861e+09 6.962181e+09 8.877689e+09
Calories 0 1 1.570000e+00 1.360000e+00 0 9.400000e-01 1.22000e+00 1.410000e+00 2.301000e+01
head(minute_calories_mar_apr)
## # A tibble: 6 × 3
##           Id ActivityMinute        Calories
##        <dbl> <chr>                    <dbl>
## 1 1503960366 3/12/2016 12:00:00 AM    0.797
## 2 1503960366 3/12/2016 12:01:00 AM    0.797
## 3 1503960366 3/12/2016 12:02:00 AM    0.797
## 4 1503960366 3/12/2016 12:03:00 AM    0.797
## 5 1503960366 3/12/2016 12:04:00 AM    0.797
## 6 1503960366 3/12/2016 12:05:00 AM    0.797
skim_without_charts(minute_intensities_mar_apr)
Data summary
Name minute_intensities_mar_ap…
Number of rows 1445040
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityMinute 0 1 19 21 0 45300 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.889424e+09 2.421516e+09 1503960366 2347167796 4558609924 6962181067 8877689391
Intensity 0 1 1.800000e-01 4.900000e-01 0 0 0 0 3
head(minute_intensities_mar_apr)
## # A tibble: 6 × 3
##           Id ActivityMinute        Intensity
##        <dbl> <chr>                     <dbl>
## 1 1503960366 3/12/2016 12:00:00 AM         0
## 2 1503960366 3/12/2016 12:01:00 AM         0
## 3 1503960366 3/12/2016 12:02:00 AM         0
## 4 1503960366 3/12/2016 12:03:00 AM         0
## 5 1503960366 3/12/2016 12:04:00 AM         0
## 6 1503960366 3/12/2016 12:05:00 AM         0
skim_without_charts(minute_mets_mar_apr)
Data summary
Name minute_mets_mar_apr
Number of rows 1445040
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityMinute 0 1 19 21 0 45300 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.889424e+09 2.421516e+09 1503960366 2347167796 4558609924 6962181067 8877689391
METs 0 1 1.424000e+01 1.154000e+01 0 10 10 11 189
head(minute_mets_mar_apr)
## # A tibble: 6 × 3
##           Id ActivityMinute         METs
##        <dbl> <chr>                 <dbl>
## 1 1503960366 3/12/2016 12:00:00 AM    10
## 2 1503960366 3/12/2016 12:01:00 AM    10
## 3 1503960366 3/12/2016 12:02:00 AM    10
## 4 1503960366 3/12/2016 12:03:00 AM    10
## 5 1503960366 3/12/2016 12:04:00 AM    10
## 6 1503960366 3/12/2016 12:05:00 AM    10
skim_without_charts(minute_sleep_mar_apr)
Data summary
Name minute_sleep_mar_apr
Number of rows 198559
Number of columns 4
_______________________
Column type frequency:
character 1
numeric 3
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
date 0 1 19 21 0 54523 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.824304e+09 2.173935e+09 1503960366 2347167796 4702921684 6775888955 8792009665
value 0 1 1.090000e+00 3.100000e-01 1 1 1 1 3
logId 0 1 1.124161e+10 7.969858e+07 11103653021 11165512026 11243951252 11310735495 11374876178
head(minute_sleep_mar_apr)
## # A tibble: 6 × 4
##           Id date                 value       logId
##        <dbl> <chr>                <dbl>       <dbl>
## 1 1503960366 3/13/2016 2:39:30 AM     1 11114919637
## 2 1503960366 3/13/2016 2:40:30 AM     1 11114919637
## 3 1503960366 3/13/2016 2:41:30 AM     1 11114919637
## 4 1503960366 3/13/2016 2:42:30 AM     1 11114919637
## 5 1503960366 3/13/2016 2:43:30 AM     1 11114919637
## 6 1503960366 3/13/2016 2:44:30 AM     1 11114919637
skim_without_charts(minute_steps_mar_apr)
Data summary
Name minute_steps_mar_apr
Number of rows 1445040
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityMinute 0 1 19 21 0 45300 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.889424e+09 2.421516e+09 1503960366 2347167796 4558609924 6962181067 8877689391
Steps 0 1 4.770000e+00 1.722000e+01 0 0 0 0 204
head(minute_steps_mar_apr)
## # A tibble: 6 × 3
##           Id ActivityMinute        Steps
##        <dbl> <chr>                 <dbl>
## 1 1503960366 3/12/2016 12:00:00 AM     0
## 2 1503960366 3/12/2016 12:01:00 AM     0
## 3 1503960366 3/12/2016 12:02:00 AM     0
## 4 1503960366 3/12/2016 12:03:00 AM     0
## 5 1503960366 3/12/2016 12:04:00 AM     0
## 6 1503960366 3/12/2016 12:05:00 AM     0
skim_without_charts(weight_log_mar_apr)
Data summary
Name weight_log_mar_apr
Number of rows 33
Number of columns 8
_______________________
Column type frequency:
character 1
logical 1
numeric 6
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Date 0 1 19 21 0 24 0

Variable type: logical

skim_variable n_missing complete_rate mean count
IsManualReport 0 1 0.7 TRU: 23, FAL: 10

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1.00 6.477156e+09 2.308888e+09 1.503960e+09 4.702922e+09 6.962181e+09 8.877689e+09 8.877689e+09
WeightKg 0 1.00 73.44 16.53 53.30 61.70 62.50 85.80 129.60
WeightPounds 0 1.00 161.91 36.44 117.51 136.03 137.79 189.16 285.72
Fat 31 0.06 16.00 8.49 10.00 13.00 16.00 19.00 22.00
BMI 0 1.00 25.73 4.33 21.45 24.10 24.39 25.76 46.17
LogId 0 1.00 1.459959e+12 3.088072e+08 1.459382e+12 1.459753e+12 1.459987e+12 1.460160e+12 1.460506e+12
head(weight_log_mar_apr)
## # A tibble: 6 × 8
##           Id Date       WeightKg WeightPounds   Fat   BMI IsManualReport   LogId
##        <dbl> <chr>         <dbl>        <dbl> <dbl> <dbl> <lgl>            <dbl>
## 1 1503960366 4/5/2016 …     53.3         118.    22  23.0 TRUE           1.46e12
## 2 1927972279 4/10/2016…    130.          286.    NA  46.2 FALSE          1.46e12
## 3 2347167796 4/3/2016 …     63.4         140.    10  24.8 TRUE           1.46e12
## 4 2873212765 4/6/2016 …     56.7         125.    NA  21.5 TRUE           1.46e12
## 5 2873212765 4/7/2016 …     57.2         126.    NA  21.6 TRUE           1.46e12
## 6 2891001357 4/5/2016 …     88.4         195.    NA  25.0 TRUE           1.46e12
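The summary above shows `Fat` missing in 31 of 33 rows (complete_rate 0.06), so that column is unlikely to support any conclusion on its own. One possible way to flag sparse columns before analysis is sketched below. The data frame `toy_weight`, its values, and the 0.5 cut-off `min_complete` are all assumptions chosen for illustration.

```r
# Toy stand-in for the weight log; values are illustrative only.
toy_weight <- data.frame(
  WeightKg = c(53.3, 129.6, 63.4, 56.7),
  Fat = c(22, NA, NA, NA)
)

# Share of non-missing values per column (what skim calls complete_rate).
complete_rate <- colMeans(!is.na(toy_weight))

# Hypothetical threshold: keep columns that are at least half complete.
min_complete <- 0.5
kept <- toy_weight[, complete_rate >= min_complete, drop = FALSE]

print(complete_rate)
print(names(kept))
```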
skim_without_charts(daily_activity_apr_may)
Data summary
Name daily_activity_apr_may
Number of rows 940
Number of columns 15
_______________________
Column type frequency:
character 1
numeric 14
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityDate 0 1 8 9 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.855407e+09 2.424805e+09 1503960366 2.320127e+09 4.445115e+09 6.962181e+09 8.877689e+09
TotalSteps 0 1 7637.91 5087.15 0 3789.75 7405.50 10727.00 36019.00
TotalDistance 0 1 5.49 3.92 0 2.62 5.24 7.71 28.03
TrackerDistance 0 1 5.48 3.91 0 2.62 5.24 7.71 28.03
LoggedActivitiesDistance 0 1 0.11 0.62 0 0.00 0.00 0.00 4.94
VeryActiveDistance 0 1 1.50 2.66 0 0.00 0.21 2.05 21.92
ModeratelyActiveDistance 0 1 0.57 0.88 0 0.00 0.24 0.80 6.48
LightActiveDistance 0 1 3.34 2.04 0 1.95 3.36 4.78 10.71
SedentaryActiveDistance 0 1 0.00 0.01 0 0.00 0.00 0.00 0.11
VeryActiveMinutes 0 1 21.16 32.84 0 0.00 4.00 32.00 210.00
FairlyActiveMinutes 0 1 13.56 19.99 0 0.00 6.00 19.00 143.00
LightlyActiveMinutes 0 1 192.81 109.17 0 127.00 199.00 264.00 518.00
SedentaryMinutes 0 1 991.21 301.27 0 729.75 1057.50 1229.50 1440.00
Calories 0 1 2303.61 718.17 0 1828.50 2134.00 2793.25 4900.00
head(daily_activity_apr_may)
## # A tibble: 6 × 15
##           Id ActivityDate TotalSteps TotalDistance TrackerDistance
##        <dbl> <chr>             <dbl>         <dbl>           <dbl>
## 1 1503960366 4/12/2016         13162          8.5             8.5 
## 2 1503960366 4/13/2016         10735          6.97            6.97
## 3 1503960366 4/14/2016         10460          6.74            6.74
## 4 1503960366 4/15/2016          9762          6.28            6.28
## 5 1503960366 4/16/2016         12669          8.16            8.16
## 6 1503960366 4/17/2016          9705          6.48            6.48
## # ℹ 10 more variables: LoggedActivitiesDistance <dbl>,
## #   VeryActiveDistance <dbl>, ModeratelyActiveDistance <dbl>,
## #   LightActiveDistance <dbl>, SedentaryActiveDistance <dbl>,
## #   VeryActiveMinutes <dbl>, FairlyActiveMinutes <dbl>,
## #   LightlyActiveMinutes <dbl>, SedentaryMinutes <dbl>, Calories <dbl>
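`ActivityDate` is read in as a character column (`<chr>`), so it sorts and groups as text rather than as a calendar date until it is converted. A minimal base R sketch of that conversion follows. `toy_daily` is a hypothetical two-row copy of the first rows shown above, and the added `Weekday` column is an assumption about what a later step might need.

```r
# Two rows mirroring the head() output above; toy_daily is a hypothetical name.
toy_daily <- data.frame(
  Id = c(1503960366, 1503960366),
  ActivityDate = c("4/12/2016", "4/13/2016"),
  TotalSteps = c(13162, 10735),
  stringsAsFactors = FALSE
)

# Parse month/day/year text into a Date.
toy_daily$ActivityDate <- as.Date(toy_daily$ActivityDate, format = "%m/%d/%Y")

# ISO weekday number (1 = Monday), which does not depend on the locale.
toy_daily$Weekday <- format(toy_daily$ActivityDate, "%u")

str(toy_daily)
```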
skim_without_charts(daily_calories_apr_may)
Data summary
Name daily_calories_apr_may
Number of rows 940
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityDay 0 1 8 9 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.855407e+09 2.424805e+09 1503960366 2320127002.0 4445114986 6.962181e+09 8877689391
Calories 0 1 2303.61 718.17 0 1828.5 2134 2793.25 4900
head(daily_calories_apr_may)
## # A tibble: 6 × 3
##           Id ActivityDay Calories
##        <dbl> <chr>          <dbl>
## 1 1503960366 4/12/2016       1985
## 2 1503960366 4/13/2016       1797
## 3 1503960366 4/14/2016       1776
## 4 1503960366 4/15/2016       1745
## 5 1503960366 4/16/2016       1863
## 6 1503960366 4/17/2016       1728
skim_without_charts(daily_intensities_apr_may)
Data summary
Name daily_intensities_apr_may
Number of rows 940
Number of columns 10
_______________________
Column type frequency:
character 1
numeric 9
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityDay 0 1 8 9 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.855407e+09 2.424805e+09 1503960366 2.320127e+09 4.445115e+09 6.962181e+09 8.877689e+09
SedentaryMinutes 0 1 991.21 301.27 0 729.75 1057.50 1229.50 1440.00
LightlyActiveMinutes 0 1 192.81 109.17 0 127.00 199.00 264.00 518.00
FairlyActiveMinutes 0 1 13.56 19.99 0 0.00 6.00 19.00 143.00
VeryActiveMinutes 0 1 21.16 32.84 0 0.00 4.00 32.00 210.00
SedentaryActiveDistance 0 1 0.00 0.01 0 0.00 0.00 0.00 0.11
LightActiveDistance 0 1 3.34 2.04 0 1.95 3.36 4.78 10.71
ModeratelyActiveDistance 0 1 0.57 0.88 0 0.00 0.24 0.80 6.48
VeryActiveDistance 0 1 1.50 2.66 0 0.00 0.21 2.05 21.92
head(daily_intensities_apr_may)
## # A tibble: 6 × 10
##         Id ActivityDay SedentaryMinutes LightlyActiveMinutes FairlyActiveMinutes
##      <dbl> <chr>                  <dbl>                <dbl>               <dbl>
## 1   1.50e9 4/12/2016                728                  328                  13
## 2   1.50e9 4/13/2016                776                  217                  19
## 3   1.50e9 4/14/2016               1218                  181                  11
## 4   1.50e9 4/15/2016                726                  209                  34
## 5   1.50e9 4/16/2016                773                  221                  10
## 6   1.50e9 4/17/2016                539                  164                  20
## # ℹ 5 more variables: VeryActiveMinutes <dbl>, SedentaryActiveDistance <dbl>,
## #   LightActiveDistance <dbl>, ModeratelyActiveDistance <dbl>,
## #   VeryActiveDistance <dbl>
skim_without_charts(daily_steps_apr_may)
Data summary
Name daily_steps_apr_may
Number of rows 940
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityDay 0 1 8 9 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.855407e+09 2.424805e+09 1503960366 2.320127e+09 4445114986.0 6962181067 8877689391
StepTotal 0 1 7637.91 5087.15 0 3789.75 7405.5 10727 36019
head(daily_steps_apr_may)
## # A tibble: 6 × 3
##           Id ActivityDay StepTotal
##        <dbl> <chr>           <dbl>
## 1 1503960366 4/12/2016       13162
## 2 1503960366 4/13/2016       10735
## 3 1503960366 4/14/2016       10460
## 4 1503960366 4/15/2016        9762
## 5 1503960366 4/16/2016       12669
## 6 1503960366 4/17/2016        9705
skim_without_charts(heart_rate_apr_may)
Data summary
Name heart_rate_apr_may
Number of rows 2483658
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Time 0 1 19 21 0 961274 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 5.513765e+09 1950223761.0 2022484408 4388161847 5553957443 6962181067 8877689391
Value 0 1 77.33 19.4 36 63 73 88 203
head(heart_rate_apr_may)
## # A tibble: 6 × 3
##           Id Time                 Value
##        <dbl> <chr>                <dbl>
## 1 2022484408 4/12/2016 7:21:00 AM    97
## 2 2022484408 4/12/2016 7:21:05 AM   102
## 3 2022484408 4/12/2016 7:21:10 AM   105
## 4 2022484408 4/12/2016 7:21:20 AM   103
## 5 2022484408 4/12/2016 7:21:25 AM   101
## 6 2022484408 4/12/2016 7:22:05 AM    95
skim_without_charts(hourly_calories_apr_may)
Data summary
Name hourly_calories_apr_may
Number of rows 22099
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 736 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.848235e+09 2.4225e+09 1503960366 2320127002 4445114986 6962181067 8877689391
Calories 0 1 97.39 60.70 42 63 83 108 948
head(hourly_calories_apr_may)
## # A tibble: 6 × 3
##           Id ActivityHour          Calories
##        <dbl> <chr>                    <dbl>
## 1 1503960366 4/12/2016 12:00:00 AM       81
## 2 1503960366 4/12/2016 1:00:00 AM        61
## 3 1503960366 4/12/2016 2:00:00 AM        59
## 4 1503960366 4/12/2016 3:00:00 AM        47
## 5 1503960366 4/12/2016 4:00:00 AM        48
## 6 1503960366 4/12/2016 5:00:00 AM        48
skim_without_charts(hourly_intensities_apr_may)
Data summary
Name hourly_intensities_apr_ma…
Number of rows 22099
Number of columns 4
_______________________
Column type frequency:
character 1
numeric 3
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 736 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.848235e+09 2.4225e+09 1503960366 2320127002 4.445115e+09 6.962181e+09 8877689391
TotalIntensity 0 1 12.04 21.13 0 0 3.00 16.00 180
AverageIntensity 0 1 0.20 0.35 0 0 0.05 0.27 3
head(hourly_intensities_apr_may)
## # A tibble: 6 × 4
##           Id ActivityHour          TotalIntensity AverageIntensity
##        <dbl> <chr>                          <dbl>            <dbl>
## 1 1503960366 4/12/2016 12:00:00 AM             20            0.333
## 2 1503960366 4/12/2016 1:00:00 AM               8            0.133
## 3 1503960366 4/12/2016 2:00:00 AM               7            0.117
## 4 1503960366 4/12/2016 3:00:00 AM               0            0    
## 5 1503960366 4/12/2016 4:00:00 AM               0            0    
## 6 1503960366 4/12/2016 5:00:00 AM               0            0
skim_without_charts(hourly_steps_apr_may)
Data summary
Name hourly_steps_apr_may
Number of rows 22099
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 736 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.848235e+09 2.4225e+09 1503960366 2320127002 4445114986 6962181067 8877689391
StepTotal 0 1 320.17 690.38 0 0 40 357 10554
head(hourly_steps_apr_may)
## # A tibble: 6 × 3
##           Id ActivityHour          StepTotal
##        <dbl> <chr>                     <dbl>
## 1 1503960366 4/12/2016 12:00:00 AM       373
## 2 1503960366 4/12/2016 1:00:00 AM        160
## 3 1503960366 4/12/2016 2:00:00 AM        151
## 4 1503960366 4/12/2016 3:00:00 AM          0
## 5 1503960366 4/12/2016 4:00:00 AM          0
## 6 1503960366 4/12/2016 5:00:00 AM          0
skim_without_charts(minute_calories_apr_may)
Data summary
Name minute_calories_apr_may
Number of rows 1325580
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityMinute 0 1 19 21 0 44160 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.847898e+09 2.422313e+09 1503960366 2.320127e+09 4.445115e+09 6.962181e+09 8.877689e+09
Calories 0 1 1.62 1.41 0 0.94 1.22 1.43 19.75
head(minute_calories_apr_may)
## # A tibble: 6 × 3
##           Id ActivityMinute        Calories
##        <dbl> <chr>                    <dbl>
## 1 1503960366 4/12/2016 12:00:00 AM    0.786
## 2 1503960366 4/12/2016 12:01:00 AM    0.786
## 3 1503960366 4/12/2016 12:02:00 AM    0.786
## 4 1503960366 4/12/2016 12:03:00 AM    0.786
## 5 1503960366 4/12/2016 12:04:00 AM    0.786
## 6 1503960366 4/12/2016 12:05:00 AM    0.944
skim_without_charts(minute_calories_apr_may_2)
Data summary
Name minute_calories_apr_may_2
Number of rows 21645
Number of columns 62
_______________________
Column type frequency:
character 1
numeric 61
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 729 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.836965e+09 2.424088e+09 1.50396e+09 2.320127e+09 4.445115e+09 6.962181e+09 8.877689e+09
Calories00 0 1 1.62 1.40 0.70 0.94 1.22 1.43 19.73
Calories01 0 1 1.63 1.40 0.70 0.94 1.22 1.43 19.73
Calories02 0 1 1.64 1.41 0.70 0.94 1.22 1.43 19.73
Calories03 0 1 1.64 1.42 0.70 0.94 1.22 1.43 19.73
Calories04 0 1 1.64 1.43 0.70 0.94 1.22 1.43 19.73
Calories05 0 1 1.64 1.44 0.70 0.94 1.22 1.43 19.73
Calories06 0 1 1.64 1.44 0.70 0.94 1.22 1.43 19.73
Calories07 0 1 1.63 1.42 0.70 0.94 1.22 1.43 19.73
Calories08 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.73
Calories09 0 1 1.62 1.40 0.70 0.94 1.22 1.43 16.76
Calories10 0 1 1.62 1.41 0.70 0.94 1.22 1.43 17.44
Calories11 0 1 1.62 1.41 0.70 0.94 1.22 1.43 16.76
Calories12 0 1 1.61 1.40 0.70 0.94 1.22 1.43 17.44
Calories13 0 1 1.61 1.41 0.70 0.94 1.22 1.43 16.68
Calories14 0 1 1.61 1.40 0.00 0.94 1.22 1.43 16.93
Calories15 0 1 1.61 1.40 0.70 0.94 1.22 1.43 17.19
Calories16 0 1 1.61 1.40 0.70 0.94 1.22 1.43 17.19
Calories17 0 1 1.61 1.41 0.70 0.94 1.22 1.43 17.44
Calories18 0 1 1.61 1.40 0.70 0.94 1.22 1.43 16.93
Calories19 0 1 1.62 1.41 0.70 0.94 1.22 1.43 16.68
Calories20 0 1 1.62 1.41 0.70 0.94 1.22 1.43 16.30
Calories21 0 1 1.61 1.41 0.70 0.94 1.22 1.43 16.83
Calories22 0 1 1.63 1.42 0.70 0.94 1.22 1.43 17.78
Calories23 0 1 1.62 1.41 0.70 0.94 1.22 1.43 17.78
Calories24 0 1 1.61 1.42 0.70 0.94 1.22 1.43 17.35
Calories25 0 1 1.62 1.42 0.00 0.94 1.22 1.43 17.09
Calories26 0 1 1.61 1.41 0.70 0.94 1.22 1.43 16.99
Calories27 0 1 1.62 1.42 0.70 0.94 1.22 1.43 17.23
Calories28 0 1 1.62 1.42 0.70 0.94 1.22 1.43 16.83
Calories29 0 1 1.62 1.43 0.70 0.94 1.22 1.43 17.35
Calories30 0 1 1.62 1.42 0.70 0.94 1.22 1.43 17.35
Calories31 0 1 1.63 1.41 0.70 0.94 1.22 1.43 17.61
Calories32 0 1 1.63 1.43 0.70 0.94 1.22 1.43 17.61
Calories33 0 1 1.64 1.44 0.70 0.94 1.22 1.43 17.61
Calories34 0 1 1.63 1.43 0.70 0.94 1.22 1.43 17.87
Calories35 0 1 1.63 1.42 0.70 0.94 1.22 1.43 17.87
Calories36 0 1 1.64 1.46 0.70 0.94 1.22 1.43 19.75
Calories37 0 1 1.64 1.45 0.70 0.94 1.22 1.43 19.75
Calories38 0 1 1.63 1.45 0.70 0.94 1.22 1.43 19.75
Calories39 0 1 1.63 1.43 0.70 0.94 1.22 1.43 19.75
Calories40 0 1 1.63 1.42 0.70 0.94 1.22 1.43 19.75
Calories41 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories42 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories43 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories44 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories45 0 1 1.62 1.40 0.70 0.94 1.22 1.43 19.75
Calories46 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories47 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories48 0 1 1.62 1.40 0.70 0.94 1.22 1.43 19.75
Calories49 0 1 1.62 1.40 0.70 0.94 1.22 1.43 19.75
Calories50 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories51 0 1 1.61 1.40 0.70 0.94 1.22 1.43 19.75
Calories52 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories53 0 1 1.62 1.40 0.70 0.94 1.22 1.43 19.75
Calories54 0 1 1.62 1.41 0.70 0.94 1.22 1.43 19.75
Calories55 0 1 1.62 1.39 0.70 0.94 1.22 1.43 19.75
Calories56 0 1 1.61 1.38 0.70 0.94 1.22 1.43 19.73
Calories57 0 1 1.61 1.37 0.70 0.94 1.22 1.43 19.73
Calories58 0 1 1.61 1.37 0.70 0.94 1.22 1.43 19.73
Calories59 0 1 1.61 1.37 0.00 0.94 1.22 1.43 19.73
head(minute_calories_apr_may_2)
## # A tibble: 6 × 62
##           Id ActivityHour Calories00 Calories01 Calories02 Calories03 Calories04
##        <dbl> <chr>             <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
## 1 1503960366 4/13/2016 1…      1.89       2.20       0.944      0.944      0.944
## 2 1503960366 4/13/2016 1…      0.786      0.786      0.786      0.786      0.944
## 3 1503960366 4/13/2016 2…      0.786      0.786      0.786      0.786      0.786
## 4 1503960366 4/13/2016 3…      0.786      0.786      0.786      0.786      0.786
## 5 1503960366 4/13/2016 4…      0.786      0.786      0.786      0.786      0.786
## 6 1503960366 4/13/2016 5…      0.786      0.786      0.786      0.786      0.786
## # ℹ 55 more variables: Calories05 <dbl>, Calories06 <dbl>, Calories07 <dbl>,
## #   Calories08 <dbl>, Calories09 <dbl>, Calories10 <dbl>, Calories11 <dbl>,
## #   Calories12 <dbl>, Calories13 <dbl>, Calories14 <dbl>, Calories15 <dbl>,
## #   Calories16 <dbl>, Calories17 <dbl>, Calories18 <dbl>, Calories19 <dbl>,
## #   Calories20 <dbl>, Calories21 <dbl>, Calories22 <dbl>, Calories23 <dbl>,
## #   Calories24 <dbl>, Calories25 <dbl>, Calories26 <dbl>, Calories27 <dbl>,
## #   Calories28 <dbl>, Calories29 <dbl>, Calories30 <dbl>, Calories31 <dbl>, …
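The `_2` table holds the same kind of minute-level measurement in wide form: one row per participant-hour, with columns `Calories00` to `Calories59`. Its 21645 rows correspond to 21645 × 60 = 1,298,700 minute values, fewer than the 1,325,580 rows of the narrow `minute_calories_apr_may` table, so the two are not exact duplicates. A minimal base R sketch of reshaping a wide table to the narrow layout, so the two can be compared row for row, is given below. `toy_wide`, its two minute columns, and its values are illustrative assumptions, and `tidyr::pivot_longer()` would be an alternative to the hand-rolled loop.

```r
# Toy wide table with only two minute columns; values are illustrative only.
toy_wide <- data.frame(
  Id = c(1503960366, 1503960366),
  ActivityHour = c("4/13/2016 12:00:00 AM", "4/13/2016 1:00:00 AM"),
  Calories00 = c(1.89, 0.786),
  Calories01 = c(2.20, 0.786),
  stringsAsFactors = FALSE
)

# Parse the hour stamp; %p assumes an English-style AM/PM locale.
hour_start <- as.POSIXct(toy_wide$ActivityHour,
                         format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Columns named CaloriesNN, where NN is the minute within the hour.
minute_cols <- grep("^Calories[0-9]{2}$", names(toy_wide), value = TRUE)

# Build one block of rows per minute column, then stack the blocks.
long_parts <- lapply(minute_cols, function(col) {
  offset_min <- as.integer(sub("^Calories", "", col))
  data.frame(
    Id = toy_wide$Id,
    ActivityMinute = hour_start + 60 * offset_min,  # POSIXct adds seconds
    Calories = toy_wide[[col]]
  )
})
toy_long <- do.call(rbind, long_parts)
toy_long <- toy_long[order(toy_long$Id, toy_long$ActivityMinute), ]
rownames(toy_long) <- NULL

print(toy_long)
```

The same pattern would apply to the wide intensity and step tables by changing the column prefix.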
skim_without_charts(minute_intensities_apr_may)
Data summary
Name minute_intensities_apr_ma…
Number of rows 1325580
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityMinute 0 1 19 21 0 44160 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4847897691.9 2.422313e+09 1503960366 2320127002 4445114986 6962181067 8877689391
Intensity 0 1 0.2 0.52 0 0 0 0 3
head(minute_intensities_apr_may)
## # A tibble: 6 × 3
##           Id ActivityMinute        Intensity
##        <dbl> <chr>                     <dbl>
## 1 1503960366 4/12/2016 12:00:00 AM         0
## 2 1503960366 4/12/2016 12:01:00 AM         0
## 3 1503960366 4/12/2016 12:02:00 AM         0
## 4 1503960366 4/12/2016 12:03:00 AM         0
## 5 1503960366 4/12/2016 12:04:00 AM         0
## 6 1503960366 4/12/2016 12:05:00 AM         0
skim_without_charts(minute_intensities_apr_may_2)
Data summary
Name minute_intensities_apr_ma…
Number of rows 21645
Number of columns 62
_______________________
Column type frequency:
character 1
numeric 61
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 729 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.836965e+09 2.424088e+09 1503960366 2320127002 4445114986 6962181067 8877689391
Intensity00 0 1 0.20 0.51 0 0 0 0 3
Intensity01 0 1 0.20 0.52 0 0 0 0 3
Intensity02 0 1 0.21 0.52 0 0 0 0 3
Intensity03 0 1 0.20 0.52 0 0 0 0 3
Intensity04 0 1 0.21 0.52 0 0 0 0 3
Intensity05 0 1 0.20 0.52 0 0 0 0 3
Intensity06 0 1 0.21 0.52 0 0 0 0 3
Intensity07 0 1 0.20 0.52 0 0 0 0 3
Intensity08 0 1 0.20 0.52 0 0 0 0 3
Intensity09 0 1 0.20 0.52 0 0 0 0 3
Intensity10 0 1 0.20 0.52 0 0 0 0 3
Intensity11 0 1 0.20 0.52 0 0 0 0 3
Intensity12 0 1 0.20 0.52 0 0 0 0 3
Intensity13 0 1 0.20 0.52 0 0 0 0 3
Intensity14 0 1 0.20 0.52 0 0 0 0 3
Intensity15 0 1 0.20 0.52 0 0 0 0 3
Intensity16 0 1 0.19 0.52 0 0 0 0 3
Intensity17 0 1 0.20 0.52 0 0 0 0 3
Intensity18 0 1 0.20 0.52 0 0 0 0 3
Intensity19 0 1 0.20 0.53 0 0 0 0 3
Intensity20 0 1 0.20 0.52 0 0 0 0 3
Intensity21 0 1 0.20 0.52 0 0 0 0 3
Intensity22 0 1 0.20 0.53 0 0 0 0 3
Intensity23 0 1 0.20 0.52 0 0 0 0 3
Intensity24 0 1 0.20 0.52 0 0 0 0 3
Intensity25 0 1 0.20 0.52 0 0 0 0 3
Intensity26 0 1 0.20 0.52 0 0 0 0 3
Intensity27 0 1 0.20 0.52 0 0 0 0 3
Intensity28 0 1 0.20 0.52 0 0 0 0 3
Intensity29 0 1 0.20 0.52 0 0 0 0 3
Intensity30 0 1 0.20 0.52 0 0 0 0 3
Intensity31 0 1 0.20 0.52 0 0 0 0 3
Intensity32 0 1 0.21 0.53 0 0 0 0 3
Intensity33 0 1 0.21 0.53 0 0 0 0 3
Intensity34 0 1 0.20 0.52 0 0 0 0 3
Intensity35 0 1 0.21 0.53 0 0 0 0 3
Intensity36 0 1 0.21 0.53 0 0 0 0 3
Intensity37 0 1 0.21 0.53 0 0 0 0 3
Intensity38 0 1 0.20 0.53 0 0 0 0 3
Intensity39 0 1 0.20 0.52 0 0 0 0 3
Intensity40 0 1 0.20 0.52 0 0 0 0 3
Intensity41 0 1 0.20 0.52 0 0 0 0 3
Intensity42 0 1 0.20 0.52 0 0 0 0 3
Intensity43 0 1 0.20 0.52 0 0 0 0 3
Intensity44 0 1 0.20 0.52 0 0 0 0 3
Intensity45 0 1 0.20 0.52 0 0 0 0 3
Intensity46 0 1 0.20 0.52 0 0 0 0 3
Intensity47 0 1 0.20 0.52 0 0 0 0 3
Intensity48 0 1 0.20 0.52 0 0 0 0 3
Intensity49 0 1 0.20 0.52 0 0 0 0 3
Intensity50 0 1 0.20 0.51 0 0 0 0 3
Intensity51 0 1 0.20 0.51 0 0 0 0 3
Intensity52 0 1 0.20 0.51 0 0 0 0 3
Intensity53 0 1 0.20 0.51 0 0 0 0 3
Intensity54 0 1 0.20 0.51 0 0 0 0 3
Intensity55 0 1 0.20 0.51 0 0 0 0 3
Intensity56 0 1 0.20 0.51 0 0 0 0 3
Intensity57 0 1 0.20 0.51 0 0 0 0 3
Intensity58 0 1 0.20 0.51 0 0 0 0 3
Intensity59 0 1 0.20 0.50 0 0 0 0 3
head(minute_intensities_apr_may_2)
## # A tibble: 6 × 62
##           Id ActivityHour        Intensity00 Intensity01 Intensity02 Intensity03
##        <dbl> <chr>                     <dbl>       <dbl>       <dbl>       <dbl>
## 1 1503960366 4/13/2016 12:00:00…           1           1           0           0
## 2 1503960366 4/13/2016 1:00:00 …           0           0           0           0
## 3 1503960366 4/13/2016 2:00:00 …           0           0           0           0
## 4 1503960366 4/13/2016 3:00:00 …           0           0           0           0
## 5 1503960366 4/13/2016 4:00:00 …           0           0           0           0
## 6 1503960366 4/13/2016 5:00:00 …           0           0           0           0
## # ℹ 56 more variables: Intensity04 <dbl>, Intensity05 <dbl>, Intensity06 <dbl>,
## #   Intensity07 <dbl>, Intensity08 <dbl>, Intensity09 <dbl>, Intensity10 <dbl>,
## #   Intensity11 <dbl>, Intensity12 <dbl>, Intensity13 <dbl>, Intensity14 <dbl>,
## #   Intensity15 <dbl>, Intensity16 <dbl>, Intensity17 <dbl>, Intensity18 <dbl>,
## #   Intensity19 <dbl>, Intensity20 <dbl>, Intensity21 <dbl>, Intensity22 <dbl>,
## #   Intensity23 <dbl>, Intensity24 <dbl>, Intensity25 <dbl>, Intensity26 <dbl>,
## #   Intensity27 <dbl>, Intensity28 <dbl>, Intensity29 <dbl>, …
skim_without_charts(minute_mets_apr_may)
Data summary
Name minute_mets_apr_may
Number of rows 1325580
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityMinute 0 1 19 21 0 44160 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.847898e+09 2.422313e+09 1503960366 2320127002 4445114986 6962181067 8877689391
METs 0 1 14.69 12.06 0 10 10 11 157
head(minute_mets_apr_may)
## # A tibble: 6 × 3
##           Id ActivityMinute         METs
##        <dbl> <chr>                 <dbl>
## 1 1503960366 4/12/2016 12:00:00 AM    10
## 2 1503960366 4/12/2016 12:01:00 AM    10
## 3 1503960366 4/12/2016 12:02:00 AM    10
## 4 1503960366 4/12/2016 12:03:00 AM    10
## 5 1503960366 4/12/2016 12:04:00 AM    10
## 6 1503960366 4/12/2016 12:05:00 AM    12
skim_without_charts(minute_sleep_apr_may)
Data summary
Name minute_sleep_apr_may
Number of rows 188521
Number of columns 4
_______________________
Column type frequency:
character 1
numeric 3
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
date 0 1 19 21 0 49773 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.996595e+09 2.066950e+09 1503960366 3977333714 4702921684 6962181067 8792009665
value 0 1 1.10 0.33 1 1 1 1 3
logId 0 1 1.149611e+10 6.822863e+07 11372227280 11439308639 11501142214 11552534115 11616251768
head(minute_sleep_apr_may)
## # A tibble: 6 × 4
##           Id date                 value       logId
##        <dbl> <chr>                <dbl>       <dbl>
## 1 1503960366 4/12/2016 2:47:30 AM     3 11380564589
## 2 1503960366 4/12/2016 2:48:30 AM     2 11380564589
## 3 1503960366 4/12/2016 2:49:30 AM     1 11380564589
## 4 1503960366 4/12/2016 2:50:30 AM     1 11380564589
## 5 1503960366 4/12/2016 2:51:30 AM     1 11380564589
## 6 1503960366 4/12/2016 2:52:30 AM     1 11380564589
skim_without_charts(minute_steps_apr_may)
Data summary
Name minute_steps_apr_may
Number of rows 1325580
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityMinute 0 1 19 21 0 44160 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.847898e+09 2.422313e+09 1503960366 2320127002 4445114986 6962181067 8877689391
Steps 0 1 5.34 18.13 0 0 0 0 220
head(minute_steps_apr_may)
## # A tibble: 6 × 3
##           Id ActivityMinute        Steps
##        <dbl> <chr>                 <dbl>
## 1 1503960366 4/12/2016 12:00:00 AM     0
## 2 1503960366 4/12/2016 12:01:00 AM     0
## 3 1503960366 4/12/2016 12:02:00 AM     0
## 4 1503960366 4/12/2016 12:03:00 AM     0
## 5 1503960366 4/12/2016 12:04:00 AM     0
## 6 1503960366 4/12/2016 12:05:00 AM     0
skim_without_charts(minute_steps_apr_may_2)
Data summary
Name minute_steps_apr_may_2
Number of rows 21645
Number of columns 62
_______________________
Column type frequency:
character 1
numeric 61
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ActivityHour 0 1 19 21 0 729 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 4.836965e+09 2.424088e+09 1503960366 2320127002 4445114986 6962181067 8877689391
Steps00 0 1 5.300000e+00 1.778000e+01 0 0 0 0 186
Steps01 0 1 5.340000e+00 1.768000e+01 0 0 0 0 180
Steps02 0 1 5.530000e+00 1.808000e+01 0 0 0 0 182
Steps03 0 1 5.470000e+00 1.811000e+01 0 0 0 0 182
Steps04 0 1 5.460000e+00 1.829000e+01 0 0 0 0 181
Steps05 0 1 5.590000e+00 1.857000e+01 0 0 0 0 180
Steps06 0 1 5.560000e+00 1.848000e+01 0 0 0 0 181
Steps07 0 1 5.410000e+00 1.834000e+01 0 0 0 0 183
Steps08 0 1 5.360000e+00 1.821000e+01 0 0 0 0 180
Steps09 0 1 5.360000e+00 1.819000e+01 0 0 0 0 183
Steps10 0 1 5.340000e+00 1.834000e+01 0 0 0 0 180
Steps11 0 1 5.290000e+00 1.818000e+01 0 0 0 0 181
Steps12 0 1 5.300000e+00 1.830000e+01 0 0 0 0 181
Steps13 0 1 5.260000e+00 1.835000e+01 0 0 0 0 180
Steps14 0 1 5.340000e+00 1.840000e+01 0 0 0 0 182
Steps15 0 1 5.280000e+00 1.829000e+01 0 0 0 0 179
Steps16 0 1 5.210000e+00 1.815000e+01 0 0 0 0 180
Steps17 0 1 5.290000e+00 1.822000e+01 0 0 0 0 183
Steps18 0 1 5.350000e+00 1.830000e+01 0 0 0 0 180
Steps19 0 1 5.420000e+00 1.849000e+01 0 0 0 0 182
Steps20 0 1 5.300000e+00 1.844000e+01 0 0 0 0 179
Steps21 0 1 5.290000e+00 1.837000e+01 0 0 0 0 185
Steps22 0 1 5.530000e+00 1.871000e+01 0 0 0 0 182
Steps23 0 1 5.350000e+00 1.839000e+01 0 0 0 0 187
Steps24 0 1 5.310000e+00 1.827000e+01 0 0 0 0 180
Steps25 0 1 5.300000e+00 1.830000e+01 0 0 0 0 181
Steps26 0 1 5.250000e+00 1.816000e+01 0 0 0 0 186
Steps27 0 1 5.310000e+00 1.822000e+01 0 0 0 0 180
Steps28 0 1 5.270000e+00 1.802000e+01 0 0 0 0 181
Steps29 0 1 5.260000e+00 1.802000e+01 0 0 0 0 183
Steps30 0 1 5.400000e+00 1.832000e+01 0 0 0 0 181
Steps31 0 1 5.360000e+00 1.812000e+01 0 0 0 0 181
Steps32 0 1 5.440000e+00 1.820000e+01 0 0 0 0 181
Steps33 0 1 5.500000e+00 1.840000e+01 0 0 0 0 182
Steps34 0 1 5.470000e+00 1.832000e+01 0 0 0 0 180
Steps35 0 1 5.420000e+00 1.819000e+01 0 0 0 0 187
Steps36 0 1 5.580000e+00 1.870000e+01 0 0 0 0 183
Steps37 0 1 5.500000e+00 1.850000e+01 0 0 0 0 181
Steps38 0 1 5.480000e+00 1.850000e+01 0 0 0 0 185
Steps39 0 1 5.340000e+00 1.806000e+01 0 0 0 0 184
Steps40 0 1 5.380000e+00 1.803000e+01 0 0 0 0 184
Steps41 0 1 5.340000e+00 1.806000e+01 0 0 0 0 184
Steps42 0 1 5.260000e+00 1.802000e+01 0 0 0 0 180
Steps43 0 1 5.290000e+00 1.784000e+01 0 0 0 0 188
Steps44 0 1 5.350000e+00 1.799000e+01 0 0 0 0 220
Steps45 0 1 5.240000e+00 1.786000e+01 0 0 0 0 184
Steps46 0 1 5.340000e+00 1.809000e+01 0 0 0 0 207
Steps47 0 1 5.300000e+00 1.794000e+01 0 0 0 0 190
Steps48 0 1 5.320000e+00 1.780000e+01 0 0 0 0 182
Steps49 0 1 5.350000e+00 1.795000e+01 0 0 0 0 182
Steps50 0 1 5.330000e+00 1.787000e+01 0 0 0 0 182
Steps51 0 1 5.190000e+00 1.760000e+01 0 0 0 0 181
Steps52 0 1 5.230000e+00 1.762000e+01 0 0 0 0 181
Steps53 0 1 5.150000e+00 1.757000e+01 0 0 0 0 181
Steps54 0 1 5.220000e+00 1.768000e+01 0 0 0 0 184
Steps55 0 1 5.280000e+00 1.783000e+01 0 0 0 0 181
Steps56 0 1 5.180000e+00 1.757000e+01 0 0 0 0 182
Steps57 0 1 5.250000e+00 1.769000e+01 0 0 0 0 182
Steps58 0 1 5.140000e+00 1.743000e+01 0 0 0 0 180
Steps59 0 1 5.290000e+00 1.772000e+01 0 0 0 0 189
head(minute_steps_apr_may_2)
## # A tibble: 6 × 62
##          Id ActivityHour Steps00 Steps01 Steps02 Steps03 Steps04 Steps05 Steps06
##       <dbl> <chr>          <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
## 1    1.50e9 4/13/2016 1…       4      16       0       0       0       9       0
## 2    1.50e9 4/13/2016 1…       0       0       0       0       0       0       0
## 3    1.50e9 4/13/2016 2…       0       0       0       0       0       0       0
## 4    1.50e9 4/13/2016 3…       0       0       0       0       0       0       0
## 5    1.50e9 4/13/2016 4…       0       0       0       0       0       0       0
## 6    1.50e9 4/13/2016 5…       0       0       0       0       0       0       0
## # ℹ 53 more variables: Steps07 <dbl>, Steps08 <dbl>, Steps09 <dbl>,
## #   Steps10 <dbl>, Steps11 <dbl>, Steps12 <dbl>, Steps13 <dbl>, Steps14 <dbl>,
## #   Steps15 <dbl>, Steps16 <dbl>, Steps17 <dbl>, Steps18 <dbl>, Steps19 <dbl>,
## #   Steps20 <dbl>, Steps21 <dbl>, Steps22 <dbl>, Steps23 <dbl>, Steps24 <dbl>,
## #   Steps25 <dbl>, Steps26 <dbl>, Steps27 <dbl>, Steps28 <dbl>, Steps29 <dbl>,
## #   Steps30 <dbl>, Steps31 <dbl>, Steps32 <dbl>, Steps33 <dbl>, Steps34 <dbl>,
## #   Steps35 <dbl>, Steps36 <dbl>, Steps37 <dbl>, Steps38 <dbl>, …
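
The wide table above holds one row per Id and hour, with the sixty minute values spread across the columns Steps00 to Steps59. At 21,645 rows of 60 minutes each it covers 1,298,700 minute values, fewer than the 1,325,580 rows in minute_steps_apr_may, so the two files may not describe exactly the same records. A minimal sketch of reshaping the wide layout into one row per minute follows, assuming tidyr's pivot_longer() is available; the toy tibble and the names wide_steps, long_steps, and Minute are illustrative and not part of the original analysis.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for minute_steps_apr_may_2 (only 3 of the 60 minute columns).
wide_steps <- tibble(
  Id = c(1503960366, 1503960366),
  ActivityHour = c("4/13/2016 12:00:00 AM", "4/13/2016 1:00:00 AM"),
  Steps00 = c(4, 0),
  Steps01 = c(16, 0),
  Steps02 = c(0, 0)
)

# Stack the minute columns: one row per Id, hour, and minute offset.
long_steps <- wide_steps %>%
  pivot_longer(
    cols = starts_with("Steps"),
    names_to = "Minute",      # hypothetical column name for the minute offset
    names_prefix = "Steps",   # strip the prefix so only "00", "01", ... remain
    values_to = "Steps"
  ) %>%
  mutate(Minute = as.integer(Minute))

print(long_steps)
```

After this step the wide file has the same shape as the narrow minute table (Id, a time key, and Steps), which makes the two easier to compare.
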
skim_without_charts(sleep_day_apr_may)
Data summary
Name sleep_day_apr_may
Number of rows 413
Number of columns 5
_______________________
Column type frequency:
character 1
numeric 4
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
SleepDay 0 1 20 21 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1 5.000979e+09 2.06036e+09 1503960366 3977333714 4702921684 6962181067 8792009665
TotalSleepRecords 0 1 1.120000e+00 3.50000e-01 1 1 1 1 3
TotalMinutesAsleep 0 1 4.194700e+02 1.18340e+02 58 361 433 490 796
TotalTimeInBed 0 1 4.586400e+02 1.27100e+02 61 403 463 526 961
head(sleep_day_apr_may)
## # A tibble: 6 × 5
##           Id SleepDay        TotalSleepRecords TotalMinutesAsleep TotalTimeInBed
##        <dbl> <chr>                       <dbl>              <dbl>          <dbl>
## 1 1503960366 4/12/2016 12:0…                 1                327            346
## 2 1503960366 4/13/2016 12:0…                 2                384            407
## 3 1503960366 4/15/2016 12:0…                 1                412            442
## 4 1503960366 4/16/2016 12:0…                 2                340            367
## 5 1503960366 4/17/2016 12:0…                 1                700            712
## 6 1503960366 4/19/2016 12:0…                 1                304            320
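
SleepDay is read in as a character column, so it cannot be sorted or grouped as a date until it is parsed. The sketch below shows one way to do that with base R. It assumes the values follow the month/day/year 12-hour pattern seen in the other timestamp columns (the head() output truncates them), and it assumes UTC as the time zone; the toy data frame and the SleepDate column are illustrative only.

```r
# Toy stand-in for sleep_day_apr_may; SleepDay is stored as text.
sleep_day <- data.frame(
  Id = c(1503960366, 1503960366),
  SleepDay = c("4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM"),
  TotalMinutesAsleep = c(327, 384),
  TotalTimeInBed = c(346, 407)
)

# Parse month/day/year with a 12-hour clock and AM/PM marker.
# The format string and tz = "UTC" are assumptions about the source data.
sleep_day$SleepDay <- as.POSIXct(
  sleep_day$SleepDay,
  format = "%m/%d/%Y %I:%M:%S %p",
  tz = "UTC"
)

# Every record falls at midnight, so a plain date is enough for daily joins.
sleep_day$SleepDate <- as.Date(sleep_day$SleepDay)

str(sleep_day)
```

The same pattern should apply to the ActivityHour, ActivityMinute, Time, and Date columns in the other tables, provided they use the same text format.
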
skim_without_charts(weight_log_apr_may)
Data summary
Name weight_log_apr_may
Number of rows 67
Number of columns 8
_______________________
Column type frequency:
character 1
logical 1
numeric 6
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Date 0 1 19 21 0 56 0

Variable type: logical

skim_variable n_missing complete_rate mean count
IsManualReport 0 1 0.61 TRU: 41, FAL: 26

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
Id 0 1.00 7.009282e+09 1.950322e+09 1.503960e+09 6.962181e+09 6.962181e+09 8.877689e+09 8.877689e+09
WeightKg 0 1.00 7.204000e+01 1.392000e+01 5.260000e+01 6.140000e+01 6.250000e+01 8.505000e+01 1.335000e+02
WeightPounds 0 1.00 1.588100e+02 3.070000e+01 1.159600e+02 1.353600e+02 1.377900e+02 1.875000e+02 2.943200e+02
Fat 65 0.03 2.350000e+01 2.120000e+00 2.200000e+01 2.275000e+01 2.350000e+01 2.425000e+01 2.500000e+01
BMI 0 1.00 2.519000e+01 3.070000e+00 2.145000e+01 2.396000e+01 2.439000e+01 2.556000e+01 4.754000e+01
LogId 0 1.00 1.461772e+12 7.829948e+08 1.460444e+12 1.461079e+12 1.461802e+12 1.462375e+12 1.463098e+12
head(weight_log_apr_may)
## # A tibble: 6 × 8
##           Id Date       WeightKg WeightPounds   Fat   BMI IsManualReport   LogId
##        <dbl> <chr>         <dbl>        <dbl> <dbl> <dbl> <lgl>            <dbl>
## 1 1503960366 5/2/2016 …     52.6         116.    22  22.6 TRUE           1.46e12
## 2 1503960366 5/3/2016 …     52.6         116.    NA  22.6 TRUE           1.46e12
## 3 1927972279 4/13/2016…    134.          294.    NA  47.5 FALSE          1.46e12
## 4 2873212765 4/21/2016…     56.7         125.    NA  21.5 TRUE           1.46e12
## 5 2873212765 5/12/2016…     57.3         126.    NA  21.7 TRUE           1.46e12
## 6 4319703577 4/17/2016…     72.4         160.    25  27.5 TRUE           1.46e12
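
The skim output above reports that Fat is missing in 65 of the 67 weight records (complete_rate 0.03), so it carries almost no usable information. One possible way to handle this is sketched below: compute the share of non-missing values per column and keep only columns above a cut-off. The 0.5 threshold, the toy rows, and the names min_complete and weight_log_clean are assumptions for illustration.

```r
library(dplyr)

# Toy rows loosely based on the head() output above.
weight_log <- tibble(
  Id = c(1503960366, 1503960366, 1927972279, 2873212765),
  WeightKg = c(52.6, 52.6, 133.5, 56.7),
  Fat = c(22, NA, NA, NA),
  BMI = c(22.6, 22.6, 47.5, 21.5)
)

# Share of non-missing values in each column.
complete_rate <- colMeans(!is.na(weight_log))
print(complete_rate)

# Hypothetical cut-off: keep columns that are at least half complete.
min_complete <- 0.5
keep <- names(complete_rate)[complete_rate >= min_complete]

weight_log_clean <- weight_log %>%
  select(all_of(keep))

print(weight_log_clean)
```
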
colnames(daily_activity_mar_apr)
##  [1] "Id"                       "ActivityDate"            
##  [3] "TotalSteps"               "TotalDistance"           
##  [5] "TrackerDistance"          "LoggedActivitiesDistance"
##  [7] "VeryActiveDistance"       "ModeratelyActiveDistance"
##  [9] "LightActiveDistance"      "SedentaryActiveDistance" 
## [11] "VeryActiveMinutes"        "FairlyActiveMinutes"     
## [13] "LightlyActiveMinutes"     "SedentaryMinutes"        
## [15] "Calories"
colnames(heart_rate_mar_apr)
## [1] "Id"    "Time"  "Value"
colnames(hourly_calories_mar_apr)
## [1] "Id"           "ActivityHour" "Calories"
colnames(hourly_intensities_mar_apr)
## [1] "Id"               "ActivityHour"     "TotalIntensity"   "AverageIntensity"
colnames(hourly_steps_mar_apr)
## [1] "Id"           "ActivityHour" "StepTotal"
colnames(minute_calories_mar_apr)
## [1] "Id"             "ActivityMinute" "Calories"
colnames(minute_intensities_mar_apr)
## [1] "Id"             "ActivityMinute" "Intensity"
colnames(minute_mets_mar_apr)
## [1] "Id"             "ActivityMinute" "METs"
colnames(minute_sleep_mar_apr)
## [1] "Id"    "date"  "value" "logId"
colnames(minute_steps_mar_apr)
## [1] "Id"             "ActivityMinute" "Steps"
colnames(weight_log_mar_apr)
## [1] "Id"             "Date"           "WeightKg"       "WeightPounds"  
## [5] "Fat"            "BMI"            "IsManualReport" "LogId"
colnames(daily_activity_apr_may)
##  [1] "Id"                       "ActivityDate"            
##  [3] "TotalSteps"               "TotalDistance"           
##  [5] "TrackerDistance"          "LoggedActivitiesDistance"
##  [7] "VeryActiveDistance"       "ModeratelyActiveDistance"
##  [9] "LightActiveDistance"      "SedentaryActiveDistance" 
## [11] "VeryActiveMinutes"        "FairlyActiveMinutes"     
## [13] "LightlyActiveMinutes"     "SedentaryMinutes"        
## [15] "Calories"
colnames(daily_calories_apr_may)
## [1] "Id"          "ActivityDay" "Calories"
colnames(daily_intensities_apr_may)
##  [1] "Id"                       "ActivityDay"             
##  [3] "SedentaryMinutes"         "LightlyActiveMinutes"    
##  [5] "FairlyActiveMinutes"      "VeryActiveMinutes"       
##  [7] "SedentaryActiveDistance"  "LightActiveDistance"     
##  [9] "ModeratelyActiveDistance" "VeryActiveDistance"
colnames(daily_steps_apr_may)
## [1] "Id"          "ActivityDay" "StepTotal"
colnames(heart_rate_apr_may)
## [1] "Id"    "Time"  "Value"
colnames(hourly_calories_apr_may)
## [1] "Id"           "ActivityHour" "Calories"
colnames(hourly_intensities_apr_may)
## [1] "Id"               "ActivityHour"     "TotalIntensity"   "AverageIntensity"
colnames(hourly_steps_apr_may)
## [1] "Id"           "ActivityHour" "StepTotal"
colnames(minute_calories_apr_may)
## [1] "Id"             "ActivityMinute" "Calories"
colnames(minute_calories_apr_may_2)
##  [1] "Id"           "ActivityHour" "Calories00"   "Calories01"   "Calories02"  
##  [6] "Calories03"   "Calories04"   "Calories05"   "Calories06"   "Calories07"  
## [11] "Calories08"   "Calories09"   "Calories10"   "Calories11"   "Calories12"  
## [16] "Calories13"   "Calories14"   "Calories15"   "Calories16"   "Calories17"  
## [21] "Calories18"   "Calories19"   "Calories20"   "Calories21"   "Calories22"  
## [26] "Calories23"   "Calories24"   "Calories25"   "Calories26"   "Calories27"  
## [31] "Calories28"   "Calories29"   "Calories30"   "Calories31"   "Calories32"  
## [36] "Calories33"   "Calories34"   "Calories35"   "Calories36"   "Calories37"  
## [41] "Calories38"   "Calories39"   "Calories40"   "Calories41"   "Calories42"  
## [46] "Calories43"   "Calories44"   "Calories45"   "Calories46"   "Calories47"  
## [51] "Calories48"   "Calories49"   "Calories50"   "Calories51"   "Calories52"  
## [56] "Calories53"   "Calories54"   "Calories55"   "Calories56"   "Calories57"  
## [61] "Calories58"   "Calories59"
colnames(minute_intensities_apr_may)
## [1] "Id"             "ActivityMinute" "Intensity"
colnames(minute_intensities_apr_may_2)
##  [1] "Id"           "ActivityHour" "Intensity00"  "Intensity01"  "Intensity02" 
##  [6] "Intensity03"  "Intensity04"  "Intensity05"  "Intensity06"  "Intensity07" 
## [11] "Intensity08"  "Intensity09"  "Intensity10"  "Intensity11"  "Intensity12" 
## [16] "Intensity13"  "Intensity14"  "Intensity15"  "Intensity16"  "Intensity17" 
## [21] "Intensity18"  "Intensity19"  "Intensity20"  "Intensity21"  "Intensity22" 
## [26] "Intensity23"  "Intensity24"  "Intensity25"  "Intensity26"  "Intensity27" 
## [31] "Intensity28"  "Intensity29"  "Intensity30"  "Intensity31"  "Intensity32" 
## [36] "Intensity33"  "Intensity34"  "Intensity35"  "Intensity36"  "Intensity37" 
## [41] "Intensity38"  "Intensity39"  "Intensity40"  "Intensity41"  "Intensity42" 
## [46] "Intensity43"  "Intensity44"  "Intensity45"  "Intensity46"  "Intensity47" 
## [51] "Intensity48"  "Intensity49"  "Intensity50"  "Intensity51"  "Intensity52" 
## [56] "Intensity53"  "Intensity54"  "Intensity55"  "Intensity56"  "Intensity57" 
## [61] "Intensity58"  "Intensity59"
colnames(minute_mets_apr_may)
## [1] "Id"             "ActivityMinute" "METs"
colnames(minute_sleep_apr_may)
## [1] "Id"    "date"  "value" "logId"
colnames(minute_steps_apr_may)
## [1] "Id"             "ActivityMinute" "Steps"
colnames(minute_steps_apr_may_2)
##  [1] "Id"           "ActivityHour" "Steps00"      "Steps01"      "Steps02"     
##  [6] "Steps03"      "Steps04"      "Steps05"      "Steps06"      "Steps07"     
## [11] "Steps08"      "Steps09"      "Steps10"      "Steps11"      "Steps12"     
## [16] "Steps13"      "Steps14"      "Steps15"      "Steps16"      "Steps17"     
## [21] "Steps18"      "Steps19"      "Steps20"      "Steps21"      "Steps22"     
## [26] "Steps23"      "Steps24"      "Steps25"      "Steps26"      "Steps27"     
## [31] "Steps28"      "Steps29"      "Steps30"      "Steps31"      "Steps32"     
## [36] "Steps33"      "Steps34"      "Steps35"      "Steps36"      "Steps37"     
## [41] "Steps38"      "Steps39"      "Steps40"      "Steps41"      "Steps42"     
## [46] "Steps43"      "Steps44"      "Steps45"      "Steps46"      "Steps47"     
## [51] "Steps48"      "Steps49"      "Steps50"      "Steps51"      "Steps52"     
## [56] "Steps53"      "Steps54"      "Steps55"      "Steps56"      "Steps57"     
## [61] "Steps58"      "Steps59"
colnames(sleep_day_apr_may)
## [1] "Id"                 "SleepDay"           "TotalSleepRecords" 
## [4] "TotalMinutesAsleep" "TotalTimeInBed"
colnames(weight_log_apr_may)
## [1] "Id"             "Date"           "WeightKg"       "WeightPounds"  
## [5] "Fat"            "BMI"            "IsManualReport" "LogId"
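
The colnames() output above shows that each March to April table has the same columns as its April to May counterpart, while the daily tables name the same quantities differently (ActivityDate and TotalSteps in daily_activity, ActivityDay and StepTotal in daily_steps). A short base R sketch of checking this programmatically before combining or joining tables follows; the toy data frames and the helper same_columns() are illustrative, not part of the original code.

```r
# Toy stand-ins that reuse the column names printed above.
tables <- list(
  daily_activity_mar_apr = data.frame(
    Id = 1503960366, ActivityDate = "3/25/2016", TotalSteps = 11004
  ),
  daily_activity_apr_may = data.frame(
    Id = 1503960366, ActivityDate = "4/12/2016", TotalSteps = 13162
  ),
  daily_steps_apr_may = data.frame(
    Id = 1503960366, ActivityDay = "4/12/2016", StepTotal = 13162
  )
)

# Hypothetical helper: TRUE when two tables share names in the same order.
same_columns <- function(a, b) identical(colnames(a), colnames(b))

# The two periods of the same table can be stacked safely.
matches <- same_columns(
  tables$daily_activity_mar_apr,
  tables$daily_activity_apr_may
)
print(matches)

# Columns that would need renaming before joining steps to activity.
diff_cols <- setdiff(
  colnames(tables$daily_steps_apr_may),
  colnames(tables$daily_activity_apr_may)
)
print(diff_cols)
```
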
daily_activity_id <- n_distinct(daily_activity_mar_apr$Id)

heart_rate_id <- n_distinct(heart_rate_mar_apr$Id)

hourly_calories_id <- n_distinct(hourly_calories_mar_apr$Id)

hourly_intensities_id <- n_distinct(hourly_intensities_mar_apr$Id)

hourly_steps_id <- n_distinct(hourly_steps_mar_apr$Id)

minute_calories_id <- n_distinct(minute_calories_mar_apr$Id)

minute_intensities_id <- n_distinct(minute_intensities_mar_apr$Id)

minute_mets_id <- n_distinct(minute_mets_mar_apr$Id)

minute_sleep_id <- n_distinct(minute_sleep_mar_apr$Id)

minute_steps_id <- n_distinct(minute_steps_mar_apr$Id)

weight_log_id <- n_distinct(weight_log_mar_apr$Id)

daily_activity_id2 <- n_distinct(daily_activity_apr_may$Id)

daily_calories_id2 <- n_distinct(daily_calories_apr_may$Id)

daily_intensities_id2 <- n_distinct(daily_intensities_apr_may$Id)

daily_steps_id2 <- n_distinct(daily_steps_apr_may$Id)

heart_rate_id2 <- n_distinct(heart_rate_apr_may$Id)

hourly_calories_id2 <- n_distinct(hourly_calories_apr_may$Id)

hourly_intensities_id2 <- n_distinct(hourly_intensities_apr_may$Id)

hourly_steps_id2 <- n_distinct(hourly_steps_apr_may$Id)

minute_calories_id2 <- n_distinct(minute_calories_apr_may$Id)

minute_calories_id3 <- n_distinct(minute_calories_apr_may_2$Id)

minute_intensities_id2 <- n_distinct(minute_intensities_apr_may$Id)

minute_intensities_id3 <- n_distinct(minute_intensities_apr_may_2$Id)

minute_mets_id2 <- n_distinct(minute_mets_apr_may$Id)

minute_sleep_id2 <- n_distinct(minute_sleep_apr_may$Id)

minute_steps_id2 <- n_distinct(minute_steps_apr_may$Id)

minute_steps_id3 <- n_distinct(minute_steps_apr_may_2$Id)

sleep_day_id2 <- n_distinct(sleep_day_apr_may$Id)

weight_log_id2 <- n_distinct(weight_log_apr_may$Id)
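
The assignments above store each participant count in its own variable but never print it. As a sketch of an alternative, the tables could be collected in a named list and counted in one pass, which yields a single summary that is easy to read and compare. The toy tibbles and the names tables and id_counts are assumptions; in the real analysis the list would hold the data frames loaded earlier.

```r
library(dplyr)

# Toy stand-ins; each element would be one of the loaded tables.
tables <- list(
  daily_activity = tibble(Id = c(1, 1, 2, 3)),
  sleep_day = tibble(Id = c(1, 1, 2)),
  weight_log = tibble(Id = c(3, 3))
)

# Count distinct participant IDs in every table at once.
id_counts <- tibble(
  table = names(tables),
  n_ids = unname(vapply(tables, function(df) n_distinct(df$Id), integer(1)))
)

print(id_counts)
```

Tables with noticeably fewer participants than the rest stand out immediately in such a summary.
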
daily_activity_mar_apr %>% nrow()
## [1] 457
heart_rate_mar_apr %>% nrow()
## [1] 1154681
hourly_calories_mar_apr %>% nrow()
## [1] 24084
hourly_intensities_mar_apr %>% nrow()
## [1] 24084
hourly_steps_mar_apr %>% nrow()
## [1] 24084
minute_calories_mar_apr %>% nrow()
## [1] 1445040
minute_intensities_mar_apr %>% nrow()
## [1] 1445040
minute_mets_mar_apr %>% nrow()
## [1] 1445040
minute_sleep_mar_apr %>% nrow()
## [1] 198559
minute_steps_mar_apr %>% nrow()
## [1] 1445040
weight_log_mar_apr %>% nrow()
## [1] 33
daily_activity_apr_may %>% nrow()
## [1] 940
daily_calories_apr_may %>% nrow()
## [1] 940
daily_intensities_apr_may %>% nrow()
## [1] 940
daily_steps_apr_may %>% nrow()
## [1] 940
heart_rate_apr_may %>% nrow()
## [1] 2483658
hourly_calories_apr_may %>% nrow()
## [1] 22099
hourly_intensities_apr_may %>% nrow()
## [1] 22099
hourly_steps_apr_may %>% nrow()
## [1] 22099
minute_calories_apr_may %>% nrow()
## [1] 1325580
minute_calories_apr_may_2 %>% nrow()
## [1] 21645
minute_intensities_apr_may %>% nrow()
## [1] 1325580
minute_intensities_apr_may_2 %>% nrow()
## [1] 21645
minute_mets_apr_may %>% nrow()
## [1] 1325580
minute_sleep_apr_may %>% nrow()
## [1] 188521
minute_steps_apr_may %>% nrow()
## [1] 1325580
minute_steps_apr_may_2 %>% nrow()
## [1] 21645
sleep_day_apr_may %>% nrow()
## [1] 413
weight_log_apr_may %>% nrow()
## [1] 67
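
The row counts above allow a quick consistency check between granularities: every hourly row should correspond to 60 minute rows. For March to April, 24,084 hourly rows times 60 gives 1,445,040, which matches the minute tables exactly. For April to May, 22,099 times 60 gives 1,325,940, which is 360 more than the 1,325,580 minute rows. The sketch below reproduces that arithmetic from the printed counts; the data frame and its column names are illustrative, and the check only flags the gap without explaining it.

```r
# Row counts copied from the nrow() output above.
row_counts <- data.frame(
  period = c("mar_apr", "apr_may"),
  hourly_rows = c(24084, 22099),
  minute_rows = c(1445040, 1325580)
)

# Each hourly record should correspond to 60 minute-level records.
row_counts$expected_minute_rows <- row_counts$hourly_rows * 60
row_counts$difference <- row_counts$expected_minute_rows - row_counts$minute_rows

print(row_counts)
```
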

3. Process:

Setting up my environment

I set up my R environment by loading the tidyverse, here, skimr, and janitor packages.
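
A minimal sketch of this setup step is shown here. It first checks which of the four packages are installed and then attaches only those, so the script does not stop on a missing package; the variable names packages and installed are illustrative, and any missing package would still need install.packages() first.

```r
# Packages named in the text above.
packages <- c("tidyverse", "here", "skimr", "janitor")

# Check which packages are installed without attaching them yet.
installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Attach the installed packages; the others need install.packages() first.
for (pkg in packages[installed]) {
  library(pkg, character.only = TRUE)
}
```
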

Processing of Data

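Cleaning and transforming the datasets. The distinct() calls below return each table with fully duplicated rows removed; when the result has the same number of rows as the nrow() output above, the table contains no duplicate rows.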
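
Calling distinct() without assigning the result only prints the de-duplicated table and leaves the original unchanged. One way to make the duplicate check explicit is sketched below: count rows before and after distinct() for every table, then overwrite only the tables that need it. The toy tibbles and the names tables and duplicate_summary are assumptions for illustration.

```r
library(dplyr)

# Toy stand-ins: the second table contains one fully duplicated row.
tables <- list(
  no_duplicates = tibble(
    Id = c(1, 1, 2),
    ActivityDay = c("4/12/2016", "4/13/2016", "4/12/2016")
  ),
  with_duplicate = tibble(
    Id = c(1, 1, 2),
    ActivityDay = c("4/12/2016", "4/12/2016", "4/12/2016")
  )
)

# Compare the row count of each table with its de-duplicated row count.
duplicate_summary <- tibble(
  table = names(tables),
  n_rows = unname(vapply(tables, nrow, integer(1))),
  n_distinct_rows = unname(
    vapply(tables, function(df) nrow(distinct(df)), integer(1))
  )
) %>%
  mutate(n_duplicates = n_rows - n_distinct_rows)

print(duplicate_summary)

# Removing duplicates only takes effect when the result is assigned back.
tables$with_duplicate <- distinct(tables$with_duplicate)
```
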

distinct(daily_activity_mar_apr)
## # A tibble: 457 × 15
##            Id ActivityDate TotalSteps TotalDistance TrackerDistance
##         <dbl> <chr>             <dbl>         <dbl>           <dbl>
##  1 1503960366 3/25/2016         11004          7.11            7.11
##  2 1503960366 3/26/2016         17609         11.6            11.6 
##  3 1503960366 3/27/2016         12736          8.53            8.53
##  4 1503960366 3/28/2016         13231          8.93            8.93
##  5 1503960366 3/29/2016         12041          7.85            7.85
##  6 1503960366 3/30/2016         10970          7.16            7.16
##  7 1503960366 3/31/2016         12256          7.86            7.86
##  8 1503960366 4/1/2016          12262          7.87            7.87
##  9 1503960366 4/2/2016          11248          7.25            7.25
## 10 1503960366 4/3/2016          10016          6.37            6.37
## # ℹ 447 more rows
## # ℹ 10 more variables: LoggedActivitiesDistance <dbl>,
## #   VeryActiveDistance <dbl>, ModeratelyActiveDistance <dbl>,
## #   LightActiveDistance <dbl>, SedentaryActiveDistance <dbl>,
## #   VeryActiveMinutes <dbl>, FairlyActiveMinutes <dbl>,
## #   LightlyActiveMinutes <dbl>, SedentaryMinutes <dbl>, Calories <dbl>
distinct(daily_activity_apr_may)
## # A tibble: 940 × 15
##            Id ActivityDate TotalSteps TotalDistance TrackerDistance
##         <dbl> <chr>             <dbl>         <dbl>           <dbl>
##  1 1503960366 4/12/2016         13162          8.5             8.5 
##  2 1503960366 4/13/2016         10735          6.97            6.97
##  3 1503960366 4/14/2016         10460          6.74            6.74
##  4 1503960366 4/15/2016          9762          6.28            6.28
##  5 1503960366 4/16/2016         12669          8.16            8.16
##  6 1503960366 4/17/2016          9705          6.48            6.48
##  7 1503960366 4/18/2016         13019          8.59            8.59
##  8 1503960366 4/19/2016         15506          9.88            9.88
##  9 1503960366 4/20/2016         10544          6.68            6.68
## 10 1503960366 4/21/2016          9819          6.34            6.34
## # ℹ 930 more rows
## # ℹ 10 more variables: LoggedActivitiesDistance <dbl>,
## #   VeryActiveDistance <dbl>, ModeratelyActiveDistance <dbl>,
## #   LightActiveDistance <dbl>, SedentaryActiveDistance <dbl>,
## #   VeryActiveMinutes <dbl>, FairlyActiveMinutes <dbl>,
## #   LightlyActiveMinutes <dbl>, SedentaryMinutes <dbl>, Calories <dbl>
distinct(daily_calories_apr_may)
## # A tibble: 940 × 3
##            Id ActivityDay Calories
##         <dbl> <chr>          <dbl>
##  1 1503960366 4/12/2016       1985
##  2 1503960366 4/13/2016       1797
##  3 1503960366 4/14/2016       1776
##  4 1503960366 4/15/2016       1745
##  5 1503960366 4/16/2016       1863
##  6 1503960366 4/17/2016       1728
##  7 1503960366 4/18/2016       1921
##  8 1503960366 4/19/2016       2035
##  9 1503960366 4/20/2016       1786
## 10 1503960366 4/21/2016       1775
## # ℹ 930 more rows
distinct(daily_intensities_apr_may)
## # A tibble: 940 × 10
##         Id ActivityDay SedentaryMinutes LightlyActiveMinutes FairlyActiveMinutes
##      <dbl> <chr>                  <dbl>                <dbl>               <dbl>
##  1  1.50e9 4/12/2016                728                  328                  13
##  2  1.50e9 4/13/2016                776                  217                  19
##  3  1.50e9 4/14/2016               1218                  181                  11
##  4  1.50e9 4/15/2016                726                  209                  34
##  5  1.50e9 4/16/2016                773                  221                  10
##  6  1.50e9 4/17/2016                539                  164                  20
##  7  1.50e9 4/18/2016               1149                  233                  16
##  8  1.50e9 4/19/2016                775                  264                  31
##  9  1.50e9 4/20/2016                818                  205                  12
## 10  1.50e9 4/21/2016                838                  211                   8
## # ℹ 930 more rows
## # ℹ 5 more variables: VeryActiveMinutes <dbl>, SedentaryActiveDistance <dbl>,
## #   LightActiveDistance <dbl>, ModeratelyActiveDistance <dbl>,
## #   VeryActiveDistance <dbl>
distinct(daily_steps_apr_may)
## # A tibble: 940 × 3
##            Id ActivityDay StepTotal
##         <dbl> <chr>           <dbl>
##  1 1503960366 4/12/2016       13162
##  2 1503960366 4/13/2016       10735
##  3 1503960366 4/14/2016       10460
##  4 1503960366 4/15/2016        9762
##  5 1503960366 4/16/2016       12669
##  6 1503960366 4/17/2016        9705
##  7 1503960366 4/18/2016       13019
##  8 1503960366 4/19/2016       15506
##  9 1503960366 4/20/2016       10544
## 10 1503960366 4/21/2016        9819
## # ℹ 930 more rows
distinct(heart_rate_mar_apr)
## # A tibble: 1,154,681 × 3
##            Id Time                Value
##         <dbl> <chr>               <dbl>
##  1 2022484408 4/1/2016 7:54:00 AM    93
##  2 2022484408 4/1/2016 7:54:05 AM    91
##  3 2022484408 4/1/2016 7:54:10 AM    96
##  4 2022484408 4/1/2016 7:54:15 AM    98
##  5 2022484408 4/1/2016 7:54:20 AM   100
##  6 2022484408 4/1/2016 7:54:25 AM   101
##  7 2022484408 4/1/2016 7:54:30 AM   104
##  8 2022484408 4/1/2016 7:54:35 AM   105
##  9 2022484408 4/1/2016 7:54:45 AM   102
## 10 2022484408 4/1/2016 7:54:55 AM   106
## # ℹ 1,154,671 more rows
distinct(heart_rate_apr_may)
## # A tibble: 2,483,658 × 3
##            Id Time                 Value
##         <dbl> <chr>                <dbl>
##  1 2022484408 4/12/2016 7:21:00 AM    97
##  2 2022484408 4/12/2016 7:21:05 AM   102
##  3 2022484408 4/12/2016 7:21:10 AM   105
##  4 2022484408 4/12/2016 7:21:20 AM   103
##  5 2022484408 4/12/2016 7:21:25 AM   101
##  6 2022484408 4/12/2016 7:22:05 AM    95
##  7 2022484408 4/12/2016 7:22:10 AM    91
##  8 2022484408 4/12/2016 7:22:15 AM    93
##  9 2022484408 4/12/2016 7:22:20 AM    94
## 10 2022484408 4/12/2016 7:22:25 AM    93
## # ℹ 2,483,648 more rows
distinct(hourly_calories_mar_apr)
## # A tibble: 24,084 × 3
##            Id ActivityHour          Calories
##         <dbl> <chr>                    <dbl>
##  1 1503960366 3/12/2016 12:00:00 AM       48
##  2 1503960366 3/12/2016 1:00:00 AM        48
##  3 1503960366 3/12/2016 2:00:00 AM        48
##  4 1503960366 3/12/2016 3:00:00 AM        48
##  5 1503960366 3/12/2016 4:00:00 AM        48
##  6 1503960366 3/12/2016 5:00:00 AM        48
##  7 1503960366 3/12/2016 6:00:00 AM        48
##  8 1503960366 3/12/2016 7:00:00 AM        48
##  9 1503960366 3/12/2016 8:00:00 AM        48
## 10 1503960366 3/12/2016 9:00:00 AM        49
## # ℹ 24,074 more rows
distinct(hourly_calories_apr_may)
## # A tibble: 22,099 × 3
##            Id ActivityHour          Calories
##         <dbl> <chr>                    <dbl>
##  1 1503960366 4/12/2016 12:00:00 AM       81
##  2 1503960366 4/12/2016 1:00:00 AM        61
##  3 1503960366 4/12/2016 2:00:00 AM        59
##  4 1503960366 4/12/2016 3:00:00 AM        47
##  5 1503960366 4/12/2016 4:00:00 AM        48
##  6 1503960366 4/12/2016 5:00:00 AM        48
##  7 1503960366 4/12/2016 6:00:00 AM        48
##  8 1503960366 4/12/2016 7:00:00 AM        47
##  9 1503960366 4/12/2016 8:00:00 AM        68
## 10 1503960366 4/12/2016 9:00:00 AM       141
## # ℹ 22,089 more rows
distinct(hourly_intensities_mar_apr)
## # A tibble: 24,084 × 4
##            Id ActivityHour          TotalIntensity AverageIntensity
##         <dbl> <chr>                          <dbl>            <dbl>
##  1 1503960366 3/12/2016 12:00:00 AM              0           0     
##  2 1503960366 3/12/2016 1:00:00 AM               0           0     
##  3 1503960366 3/12/2016 2:00:00 AM               0           0     
##  4 1503960366 3/12/2016 3:00:00 AM               0           0     
##  5 1503960366 3/12/2016 4:00:00 AM               0           0     
##  6 1503960366 3/12/2016 5:00:00 AM               0           0     
##  7 1503960366 3/12/2016 6:00:00 AM               0           0     
##  8 1503960366 3/12/2016 7:00:00 AM               0           0     
##  9 1503960366 3/12/2016 8:00:00 AM               0           0     
## 10 1503960366 3/12/2016 9:00:00 AM               1           0.0167
## # ℹ 24,074 more rows
distinct(hourly_intensities_apr_may)
## # A tibble: 22,099 × 4
##            Id ActivityHour          TotalIntensity AverageIntensity
##         <dbl> <chr>                          <dbl>            <dbl>
##  1 1503960366 4/12/2016 12:00:00 AM             20            0.333
##  2 1503960366 4/12/2016 1:00:00 AM               8            0.133
##  3 1503960366 4/12/2016 2:00:00 AM               7            0.117
##  4 1503960366 4/12/2016 3:00:00 AM               0            0    
##  5 1503960366 4/12/2016 4:00:00 AM               0            0    
##  6 1503960366 4/12/2016 5:00:00 AM               0            0    
##  7 1503960366 4/12/2016 6:00:00 AM               0            0    
##  8 1503960366 4/12/2016 7:00:00 AM               0            0    
##  9 1503960366 4/12/2016 8:00:00 AM              13            0.217
## 10 1503960366 4/12/2016 9:00:00 AM              30            0.5  
## # ℹ 22,089 more rows
distinct(hourly_steps_mar_apr)
## # A tibble: 24,084 × 3
##            Id ActivityHour          StepTotal
##         <dbl> <chr>                     <dbl>
##  1 1503960366 3/12/2016 12:00:00 AM         0
##  2 1503960366 3/12/2016 1:00:00 AM          0
##  3 1503960366 3/12/2016 2:00:00 AM          0
##  4 1503960366 3/12/2016 3:00:00 AM          0
##  5 1503960366 3/12/2016 4:00:00 AM          0
##  6 1503960366 3/12/2016 5:00:00 AM          0
##  7 1503960366 3/12/2016 6:00:00 AM          0
##  8 1503960366 3/12/2016 7:00:00 AM          0
##  9 1503960366 3/12/2016 8:00:00 AM          0
## 10 1503960366 3/12/2016 9:00:00 AM          8
## # ℹ 24,074 more rows
distinct(hourly_steps_apr_may)
## # A tibble: 22,099 × 3
##            Id ActivityHour          StepTotal
##         <dbl> <chr>                     <dbl>
##  1 1503960366 4/12/2016 12:00:00 AM       373
##  2 1503960366 4/12/2016 1:00:00 AM        160
##  3 1503960366 4/12/2016 2:00:00 AM        151
##  4 1503960366 4/12/2016 3:00:00 AM          0
##  5 1503960366 4/12/2016 4:00:00 AM          0
##  6 1503960366 4/12/2016 5:00:00 AM          0
##  7 1503960366 4/12/2016 6:00:00 AM          0
##  8 1503960366 4/12/2016 7:00:00 AM          0
##  9 1503960366 4/12/2016 8:00:00 AM        250
## 10 1503960366 4/12/2016 9:00:00 AM       1864
## # ℹ 22,089 more rows
distinct(minute_calories_mar_apr)
## # A tibble: 1,445,040 × 3
##            Id ActivityMinute        Calories
##         <dbl> <chr>                    <dbl>
##  1 1503960366 3/12/2016 12:00:00 AM    0.797
##  2 1503960366 3/12/2016 12:01:00 AM    0.797
##  3 1503960366 3/12/2016 12:02:00 AM    0.797
##  4 1503960366 3/12/2016 12:03:00 AM    0.797
##  5 1503960366 3/12/2016 12:04:00 AM    0.797
##  6 1503960366 3/12/2016 12:05:00 AM    0.797
##  7 1503960366 3/12/2016 12:06:00 AM    0.797
##  8 1503960366 3/12/2016 12:07:00 AM    0.797
##  9 1503960366 3/12/2016 12:08:00 AM    0.797
## 10 1503960366 3/12/2016 12:09:00 AM    0.797
## # ℹ 1,445,030 more rows
distinct(minute_calories_apr_may)
## # A tibble: 1,325,580 × 3
##            Id ActivityMinute        Calories
##         <dbl> <chr>                    <dbl>
##  1 1503960366 4/12/2016 12:00:00 AM    0.786
##  2 1503960366 4/12/2016 12:01:00 AM    0.786
##  3 1503960366 4/12/2016 12:02:00 AM    0.786
##  4 1503960366 4/12/2016 12:03:00 AM    0.786
##  5 1503960366 4/12/2016 12:04:00 AM    0.786
##  6 1503960366 4/12/2016 12:05:00 AM    0.944
##  7 1503960366 4/12/2016 12:06:00 AM    0.944
##  8 1503960366 4/12/2016 12:07:00 AM    0.944
##  9 1503960366 4/12/2016 12:08:00 AM    0.944
## 10 1503960366 4/12/2016 12:09:00 AM    0.944
## # ℹ 1,325,570 more rows
distinct(minute_calories_apr_may_2)
## # A tibble: 21,645 × 62
##           Id ActivityHour Calories00 Calories01 Calories02 Calories03 Calories04
##        <dbl> <chr>             <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
##  1    1.50e9 4/13/2016 1…      1.89       2.20       0.944      0.944      0.944
##  2    1.50e9 4/13/2016 1…      0.786      0.786      0.786      0.786      0.944
##  3    1.50e9 4/13/2016 2…      0.786      0.786      0.786      0.786      0.786
##  4    1.50e9 4/13/2016 3…      0.786      0.786      0.786      0.786      0.786
##  5    1.50e9 4/13/2016 4…      0.786      0.786      0.786      0.786      0.786
##  6    1.50e9 4/13/2016 5…      0.786      0.786      0.786      0.786      0.786
##  7    1.50e9 4/13/2016 6…      0.786      0.786      0.786      0.786      0.786
##  8    1.50e9 4/13/2016 7…      0.786      0.786      0.786      0.786      0.786
##  9    1.50e9 4/13/2016 8…      0.944      0.786      0.786      0.786      0.786
## 10    1.50e9 4/13/2016 9…      0.944      2.20       2.04       2.52       2.67 
## # ℹ 21,635 more rows
## # ℹ 55 more variables: Calories05 <dbl>, Calories06 <dbl>, Calories07 <dbl>,
## #   Calories08 <dbl>, Calories09 <dbl>, Calories10 <dbl>, Calories11 <dbl>,
## #   Calories12 <dbl>, Calories13 <dbl>, Calories14 <dbl>, Calories15 <dbl>,
## #   Calories16 <dbl>, Calories17 <dbl>, Calories18 <dbl>, Calories19 <dbl>,
## #   Calories20 <dbl>, Calories21 <dbl>, Calories22 <dbl>, Calories23 <dbl>,
## #   Calories24 <dbl>, Calories25 <dbl>, Calories26 <dbl>, Calories27 <dbl>, …
distinct(minute_intensities_mar_apr)
## # A tibble: 1,445,040 × 3
##            Id ActivityMinute        Intensity
##         <dbl> <chr>                     <dbl>
##  1 1503960366 3/12/2016 12:00:00 AM         0
##  2 1503960366 3/12/2016 12:01:00 AM         0
##  3 1503960366 3/12/2016 12:02:00 AM         0
##  4 1503960366 3/12/2016 12:03:00 AM         0
##  5 1503960366 3/12/2016 12:04:00 AM         0
##  6 1503960366 3/12/2016 12:05:00 AM         0
##  7 1503960366 3/12/2016 12:06:00 AM         0
##  8 1503960366 3/12/2016 12:07:00 AM         0
##  9 1503960366 3/12/2016 12:08:00 AM         0
## 10 1503960366 3/12/2016 12:09:00 AM         0
## # ℹ 1,445,030 more rows
distinct(minute_intensities_apr_may)
## # A tibble: 1,325,580 × 3
##            Id ActivityMinute        Intensity
##         <dbl> <chr>                     <dbl>
##  1 1503960366 4/12/2016 12:00:00 AM         0
##  2 1503960366 4/12/2016 12:01:00 AM         0
##  3 1503960366 4/12/2016 12:02:00 AM         0
##  4 1503960366 4/12/2016 12:03:00 AM         0
##  5 1503960366 4/12/2016 12:04:00 AM         0
##  6 1503960366 4/12/2016 12:05:00 AM         0
##  7 1503960366 4/12/2016 12:06:00 AM         0
##  8 1503960366 4/12/2016 12:07:00 AM         0
##  9 1503960366 4/12/2016 12:08:00 AM         0
## 10 1503960366 4/12/2016 12:09:00 AM         0
## # ℹ 1,325,570 more rows
distinct(minute_intensities_apr_may_2)
## # A tibble: 21,645 × 62
##            Id ActivityHour       Intensity00 Intensity01 Intensity02 Intensity03
##         <dbl> <chr>                    <dbl>       <dbl>       <dbl>       <dbl>
##  1 1503960366 4/13/2016 12:00:0…           1           1           0           0
##  2 1503960366 4/13/2016 1:00:00…           0           0           0           0
##  3 1503960366 4/13/2016 2:00:00…           0           0           0           0
##  4 1503960366 4/13/2016 3:00:00…           0           0           0           0
##  5 1503960366 4/13/2016 4:00:00…           0           0           0           0
##  6 1503960366 4/13/2016 5:00:00…           0           0           0           0
##  7 1503960366 4/13/2016 6:00:00…           0           0           0           0
##  8 1503960366 4/13/2016 7:00:00…           0           0           0           0
##  9 1503960366 4/13/2016 8:00:00…           0           0           0           0
## 10 1503960366 4/13/2016 9:00:00…           0           1           1           1
## # ℹ 21,635 more rows
## # ℹ 56 more variables: Intensity04 <dbl>, Intensity05 <dbl>, Intensity06 <dbl>,
## #   Intensity07 <dbl>, Intensity08 <dbl>, Intensity09 <dbl>, Intensity10 <dbl>,
## #   Intensity11 <dbl>, Intensity12 <dbl>, Intensity13 <dbl>, Intensity14 <dbl>,
## #   Intensity15 <dbl>, Intensity16 <dbl>, Intensity17 <dbl>, Intensity18 <dbl>,
## #   Intensity19 <dbl>, Intensity20 <dbl>, Intensity21 <dbl>, Intensity22 <dbl>,
## #   Intensity23 <dbl>, Intensity24 <dbl>, Intensity25 <dbl>, …
distinct(minute_mets_mar_apr)
## # A tibble: 1,445,040 × 3
##            Id ActivityMinute         METs
##         <dbl> <chr>                 <dbl>
##  1 1503960366 3/12/2016 12:00:00 AM    10
##  2 1503960366 3/12/2016 12:01:00 AM    10
##  3 1503960366 3/12/2016 12:02:00 AM    10
##  4 1503960366 3/12/2016 12:03:00 AM    10
##  5 1503960366 3/12/2016 12:04:00 AM    10
##  6 1503960366 3/12/2016 12:05:00 AM    10
##  7 1503960366 3/12/2016 12:06:00 AM    10
##  8 1503960366 3/12/2016 12:07:00 AM    10
##  9 1503960366 3/12/2016 12:08:00 AM    10
## 10 1503960366 3/12/2016 12:09:00 AM    10
## # ℹ 1,445,030 more rows
distinct(minute_mets_apr_may)
## # A tibble: 1,325,580 × 3
##            Id ActivityMinute         METs
##         <dbl> <chr>                 <dbl>
##  1 1503960366 4/12/2016 12:00:00 AM    10
##  2 1503960366 4/12/2016 12:01:00 AM    10
##  3 1503960366 4/12/2016 12:02:00 AM    10
##  4 1503960366 4/12/2016 12:03:00 AM    10
##  5 1503960366 4/12/2016 12:04:00 AM    10
##  6 1503960366 4/12/2016 12:05:00 AM    12
##  7 1503960366 4/12/2016 12:06:00 AM    12
##  8 1503960366 4/12/2016 12:07:00 AM    12
##  9 1503960366 4/12/2016 12:08:00 AM    12
## 10 1503960366 4/12/2016 12:09:00 AM    12
## # ℹ 1,325,570 more rows
distinct(minute_sleep_mar_apr)
## # A tibble: 198,034 × 4
##            Id date                 value       logId
##         <dbl> <chr>                <dbl>       <dbl>
##  1 1503960366 3/13/2016 2:39:30 AM     1 11114919637
##  2 1503960366 3/13/2016 2:40:30 AM     1 11114919637
##  3 1503960366 3/13/2016 2:41:30 AM     1 11114919637
##  4 1503960366 3/13/2016 2:42:30 AM     1 11114919637
##  5 1503960366 3/13/2016 2:43:30 AM     1 11114919637
##  6 1503960366 3/13/2016 2:44:30 AM     1 11114919637
##  7 1503960366 3/13/2016 2:45:30 AM     2 11114919637
##  8 1503960366 3/13/2016 2:46:30 AM     2 11114919637
##  9 1503960366 3/13/2016 2:47:30 AM     1 11114919637
## 10 1503960366 3/13/2016 2:48:30 AM     1 11114919637
## # ℹ 198,024 more rows
distinct(minute_sleep_apr_may)
## # A tibble: 187,978 × 4
##            Id date                 value       logId
##         <dbl> <chr>                <dbl>       <dbl>
##  1 1503960366 4/12/2016 2:47:30 AM     3 11380564589
##  2 1503960366 4/12/2016 2:48:30 AM     2 11380564589
##  3 1503960366 4/12/2016 2:49:30 AM     1 11380564589
##  4 1503960366 4/12/2016 2:50:30 AM     1 11380564589
##  5 1503960366 4/12/2016 2:51:30 AM     1 11380564589
##  6 1503960366 4/12/2016 2:52:30 AM     1 11380564589
##  7 1503960366 4/12/2016 2:53:30 AM     1 11380564589
##  8 1503960366 4/12/2016 2:54:30 AM     2 11380564589
##  9 1503960366 4/12/2016 2:55:30 AM     2 11380564589
## 10 1503960366 4/12/2016 2:56:30 AM     2 11380564589
## # ℹ 187,968 more rows
distinct(minute_steps_mar_apr)
## # A tibble: 1,445,040 × 3
##            Id ActivityMinute        Steps
##         <dbl> <chr>                 <dbl>
##  1 1503960366 3/12/2016 12:00:00 AM     0
##  2 1503960366 3/12/2016 12:01:00 AM     0
##  3 1503960366 3/12/2016 12:02:00 AM     0
##  4 1503960366 3/12/2016 12:03:00 AM     0
##  5 1503960366 3/12/2016 12:04:00 AM     0
##  6 1503960366 3/12/2016 12:05:00 AM     0
##  7 1503960366 3/12/2016 12:06:00 AM     0
##  8 1503960366 3/12/2016 12:07:00 AM     0
##  9 1503960366 3/12/2016 12:08:00 AM     0
## 10 1503960366 3/12/2016 12:09:00 AM     0
## # ℹ 1,445,030 more rows
distinct(minute_steps_apr_may)
## # A tibble: 1,325,580 × 3
##            Id ActivityMinute        Steps
##         <dbl> <chr>                 <dbl>
##  1 1503960366 4/12/2016 12:00:00 AM     0
##  2 1503960366 4/12/2016 12:01:00 AM     0
##  3 1503960366 4/12/2016 12:02:00 AM     0
##  4 1503960366 4/12/2016 12:03:00 AM     0
##  5 1503960366 4/12/2016 12:04:00 AM     0
##  6 1503960366 4/12/2016 12:05:00 AM     0
##  7 1503960366 4/12/2016 12:06:00 AM     0
##  8 1503960366 4/12/2016 12:07:00 AM     0
##  9 1503960366 4/12/2016 12:08:00 AM     0
## 10 1503960366 4/12/2016 12:09:00 AM     0
## # ℹ 1,325,570 more rows
distinct(minute_steps_apr_may_2)
## # A tibble: 21,645 × 62
##          Id ActivityHour Steps00 Steps01 Steps02 Steps03 Steps04 Steps05 Steps06
##       <dbl> <chr>          <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
##  1   1.50e9 4/13/2016 1…       4      16       0       0       0       9       0
##  2   1.50e9 4/13/2016 1…       0       0       0       0       0       0       0
##  3   1.50e9 4/13/2016 2…       0       0       0       0       0       0       0
##  4   1.50e9 4/13/2016 3…       0       0       0       0       0       0       0
##  5   1.50e9 4/13/2016 4…       0       0       0       0       0       0       0
##  6   1.50e9 4/13/2016 5…       0       0       0       0       0       0       0
##  7   1.50e9 4/13/2016 6…       0       0       0       0       0       0       0
##  8   1.50e9 4/13/2016 7…       0       0       0       0       0       0       0
##  9   1.50e9 4/13/2016 8…       0       0       0       0       0       0       0
## 10   1.50e9 4/13/2016 9…       0      14      10      31      37      17      25
## # ℹ 21,635 more rows
## # ℹ 53 more variables: Steps07 <dbl>, Steps08 <dbl>, Steps09 <dbl>,
## #   Steps10 <dbl>, Steps11 <dbl>, Steps12 <dbl>, Steps13 <dbl>, Steps14 <dbl>,
## #   Steps15 <dbl>, Steps16 <dbl>, Steps17 <dbl>, Steps18 <dbl>, Steps19 <dbl>,
## #   Steps20 <dbl>, Steps21 <dbl>, Steps22 <dbl>, Steps23 <dbl>, Steps24 <dbl>,
## #   Steps25 <dbl>, Steps26 <dbl>, Steps27 <dbl>, Steps28 <dbl>, Steps29 <dbl>,
## #   Steps30 <dbl>, Steps31 <dbl>, Steps32 <dbl>, Steps33 <dbl>, …
distinct(sleep_day_apr_may)
## # A tibble: 410 × 5
##            Id SleepDay       TotalSleepRecords TotalMinutesAsleep TotalTimeInBed
##         <dbl> <chr>                      <dbl>              <dbl>          <dbl>
##  1 1503960366 4/12/2016 12:…                 1                327            346
##  2 1503960366 4/13/2016 12:…                 2                384            407
##  3 1503960366 4/15/2016 12:…                 1                412            442
##  4 1503960366 4/16/2016 12:…                 2                340            367
##  5 1503960366 4/17/2016 12:…                 1                700            712
##  6 1503960366 4/19/2016 12:…                 1                304            320
##  7 1503960366 4/20/2016 12:…                 1                360            377
##  8 1503960366 4/21/2016 12:…                 1                325            364
##  9 1503960366 4/23/2016 12:…                 1                361            384
## 10 1503960366 4/24/2016 12:…                 1                430            449
## # ℹ 400 more rows
distinct(weight_log_mar_apr)
## # A tibble: 33 × 8
##            Id Date      WeightKg WeightPounds   Fat   BMI IsManualReport   LogId
##         <dbl> <chr>        <dbl>        <dbl> <dbl> <dbl> <lgl>            <dbl>
##  1 1503960366 4/5/2016…     53.3         118.    22  23.0 TRUE           1.46e12
##  2 1927972279 4/10/201…    130.          286.    NA  46.2 FALSE          1.46e12
##  3 2347167796 4/3/2016…     63.4         140.    10  24.8 TRUE           1.46e12
##  4 2873212765 4/6/2016…     56.7         125.    NA  21.5 TRUE           1.46e12
##  5 2873212765 4/7/2016…     57.2         126.    NA  21.6 TRUE           1.46e12
##  6 2891001357 4/5/2016…     88.4         195.    NA  25.0 TRUE           1.46e12
##  7 4445114986 3/30/201…     92.4         204.    NA  35.0 TRUE           1.46e12
##  8 4558609924 4/8/2016…     69.4         153.    NA  27.1 TRUE           1.46e12
##  9 4702921684 4/4/2016…     99.7         220.    NA  26.1 TRUE           1.46e12
## 10 6962181067 3/30/201…     61.5         136.    NA  24.0 TRUE           1.46e12
## # ℹ 23 more rows
distinct(weight_log_apr_may)
## # A tibble: 67 × 8
##            Id Date      WeightKg WeightPounds   Fat   BMI IsManualReport   LogId
##         <dbl> <chr>        <dbl>        <dbl> <dbl> <dbl> <lgl>            <dbl>
##  1 1503960366 5/2/2016…     52.6         116.    22  22.6 TRUE           1.46e12
##  2 1503960366 5/3/2016…     52.6         116.    NA  22.6 TRUE           1.46e12
##  3 1927972279 4/13/201…    134.          294.    NA  47.5 FALSE          1.46e12
##  4 2873212765 4/21/201…     56.7         125.    NA  21.5 TRUE           1.46e12
##  5 2873212765 5/12/201…     57.3         126.    NA  21.7 TRUE           1.46e12
##  6 4319703577 4/17/201…     72.4         160.    25  27.5 TRUE           1.46e12
##  7 4319703577 5/4/2016…     72.3         159.    NA  27.4 TRUE           1.46e12
##  8 4558609924 4/18/201…     69.7         154.    NA  27.2 TRUE           1.46e12
##  9 4558609924 4/25/201…     70.3         155.    NA  27.5 TRUE           1.46e12
## 10 4558609924 5/1/2016…     69.9         154.    NA  27.3 TRUE           1.46e12
## # ℹ 57 more rows
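Note that `distinct()` returns a new tibble and leaves its input unchanged, so calls written without an assignment only print the de-duplicated rows. The following is a minimal sketch, not the author's method, of how duplicate rows might be counted and the de-duplicated result kept. The tibble `toy_sleep` and its values are made up for illustration and are not taken from the dataset.

```r
library(dplyr)

# Hypothetical toy data: the second and third rows are exact duplicates.
toy_sleep <- tibble(
  id    = c(1, 1, 1, 2),
  date  = c("4/12/2016 2:47:30 AM", "4/12/2016 2:48:30 AM",
            "4/12/2016 2:48:30 AM", "4/12/2016 2:47:30 AM"),
  value = c(3, 2, 2, 1)
)

# Count fully duplicated rows before removing them.
n_duplicates <- sum(duplicated(toy_sleep))

# distinct() does not modify toy_sleep; assign the result to keep it.
toy_sleep_distinct <- distinct(toy_sleep)

n_duplicates
nrow(toy_sleep)
nrow(toy_sleep_distinct)
```

Comparing `nrow()` before and after is a quick way to see whether a table contained any duplicates at all.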
daily_activity_mar_apr_cleaned <- clean_names(daily_activity_mar_apr)

daily_activity_apr_may_cleaned <- clean_names(daily_activity_apr_may)

daily_calories_apr_may_cleaned <- clean_names(daily_calories_apr_may)

daily_intensities_apr_may_cleaned <- clean_names(daily_intensities_apr_may)

daily_steps_apr_may_cleaned <- clean_names(daily_steps_apr_may)

heart_rate_mar_apr_cleaned <- clean_names(heart_rate_mar_apr)

heart_rate_apr_may_cleaned <- clean_names(heart_rate_apr_may)

hourly_calories_mar_apr_cleaned <- clean_names(hourly_calories_mar_apr)

hourly_calories_apr_may_cleaned <- clean_names(hourly_calories_apr_may)

hourly_intensities_mar_apr_cleaned <- clean_names(hourly_intensities_mar_apr)

hourly_intensities_apr_may_cleaned <- clean_names(hourly_intensities_apr_may)

hourly_steps_mar_apr_cleaned <- clean_names(hourly_steps_mar_apr)

hourly_steps_apr_may_cleaned <- clean_names(hourly_steps_apr_may)

minute_calories_mar_apr_cleaned <- clean_names(minute_calories_mar_apr)

minute_calories_apr_may_cleaned <- clean_names(minute_calories_apr_may)

minute_calories_apr_may_2_cleaned <- clean_names(minute_calories_apr_may_2)

minute_intensities_mar_apr_cleaned <- clean_names(minute_intensities_mar_apr)

minute_intensities_apr_may_cleaned <- clean_names(minute_intensities_apr_may)

minute_intensities_apr_may_2_cleaned <- clean_names(minute_intensities_apr_may_2)

minute_mets_mar_apr_cleaned <- clean_names(minute_mets_mar_apr)

minute_mets_apr_may_cleaned <- clean_names(minute_mets_apr_may)

minute_sleep_mar_apr_cleaned <- clean_names(minute_sleep_mar_apr)

minute_sleep_apr_may_cleaned <- clean_names(minute_sleep_apr_may)

minute_steps_mar_apr_cleaned <- clean_names(minute_steps_mar_apr)

minute_steps_apr_may_cleaned <- clean_names(minute_steps_apr_may)

minute_steps_apr_may_2_cleaned <- clean_names(minute_steps_apr_may_2)

sleep_day_apr_may_cleaned <- clean_names(sleep_day_apr_may)

weight_log_mar_apr_cleaned <- clean_names(weight_log_mar_apr)

weight_log_apr_may_cleaned <- clean_names(weight_log_apr_may)
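The same `clean_names()` call is repeated once per table above. As a sketch of an alternative, assuming the raw tables are first collected in a named list, the cleaning could be applied in one pass with `lapply()`. The list `raw_tables` and the two small data frames in it are hypothetical stand-ins for the imported data.

```r
library(janitor)

# Hypothetical stand-ins for two of the imported tables.
raw_tables <- list(
  daily_activity = data.frame(Id = 1, ActivityDate = "4/12/2016", TotalSteps = 100),
  minute_mets    = data.frame(Id = 1, ActivityMinute = "4/12/2016 12:00:00 AM", METs = 10)
)

# Apply clean_names() to every table; lapply() keeps the list names.
cleaned_tables <- lapply(raw_tables, clean_names)
names(cleaned_tables) <- paste0(names(cleaned_tables), "_cleaned")

# Inspect the cleaned column names of every table at once.
lapply(cleaned_tables, colnames)
```

Keeping the tables in a list also makes later steps, such as checking column names, a single call rather than one call per table.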
colnames(daily_activity_mar_apr_cleaned)
##  [1] "id"                         "activity_date"             
##  [3] "total_steps"                "total_distance"            
##  [5] "tracker_distance"           "logged_activities_distance"
##  [7] "very_active_distance"       "moderately_active_distance"
##  [9] "light_active_distance"      "sedentary_active_distance" 
## [11] "very_active_minutes"        "fairly_active_minutes"     
## [13] "lightly_active_minutes"     "sedentary_minutes"         
## [15] "calories"
colnames(daily_activity_apr_may_cleaned)
##  [1] "id"                         "activity_date"             
##  [3] "total_steps"                "total_distance"            
##  [5] "tracker_distance"           "logged_activities_distance"
##  [7] "very_active_distance"       "moderately_active_distance"
##  [9] "light_active_distance"      "sedentary_active_distance" 
## [11] "very_active_minutes"        "fairly_active_minutes"     
## [13] "lightly_active_minutes"     "sedentary_minutes"         
## [15] "calories"
colnames(daily_calories_apr_may_cleaned)
## [1] "id"           "activity_day" "calories"
colnames(daily_intensities_apr_may_cleaned)
##  [1] "id"                         "activity_day"              
##  [3] "sedentary_minutes"          "lightly_active_minutes"    
##  [5] "fairly_active_minutes"      "very_active_minutes"       
##  [7] "sedentary_active_distance"  "light_active_distance"     
##  [9] "moderately_active_distance" "very_active_distance"
colnames(daily_steps_apr_may_cleaned)
## [1] "id"           "activity_day" "step_total"
colnames(heart_rate_mar_apr_cleaned)
## [1] "id"    "time"  "value"
colnames(heart_rate_apr_may_cleaned)
## [1] "id"    "time"  "value"
colnames(hourly_calories_mar_apr_cleaned)
## [1] "id"            "activity_hour" "calories"
colnames(hourly_calories_apr_may_cleaned)
## [1] "id"            "activity_hour" "calories"
colnames(hourly_intensities_mar_apr_cleaned)
## [1] "id"                "activity_hour"     "total_intensity"  
## [4] "average_intensity"
colnames(hourly_intensities_apr_may_cleaned)
## [1] "id"                "activity_hour"     "total_intensity"  
## [4] "average_intensity"
colnames(hourly_steps_mar_apr_cleaned)
## [1] "id"            "activity_hour" "step_total"
colnames(hourly_steps_apr_may_cleaned)
## [1] "id"            "activity_hour" "step_total"
colnames(minute_calories_mar_apr_cleaned)
## [1] "id"              "activity_minute" "calories"
colnames(minute_calories_apr_may_cleaned)
## [1] "id"              "activity_minute" "calories"
colnames(minute_calories_apr_may_2_cleaned)
##  [1] "id"            "activity_hour" "calories00"    "calories01"   
##  [5] "calories02"    "calories03"    "calories04"    "calories05"   
##  [9] "calories06"    "calories07"    "calories08"    "calories09"   
## [13] "calories10"    "calories11"    "calories12"    "calories13"   
## [17] "calories14"    "calories15"    "calories16"    "calories17"   
## [21] "calories18"    "calories19"    "calories20"    "calories21"   
## [25] "calories22"    "calories23"    "calories24"    "calories25"   
## [29] "calories26"    "calories27"    "calories28"    "calories29"   
## [33] "calories30"    "calories31"    "calories32"    "calories33"   
## [37] "calories34"    "calories35"    "calories36"    "calories37"   
## [41] "calories38"    "calories39"    "calories40"    "calories41"   
## [45] "calories42"    "calories43"    "calories44"    "calories45"   
## [49] "calories46"    "calories47"    "calories48"    "calories49"   
## [53] "calories50"    "calories51"    "calories52"    "calories53"   
## [57] "calories54"    "calories55"    "calories56"    "calories57"   
## [61] "calories58"    "calories59"
colnames(minute_intensities_mar_apr_cleaned)
## [1] "id"              "activity_minute" "intensity"
colnames(minute_intensities_apr_may_cleaned)
## [1] "id"              "activity_minute" "intensity"
colnames(minute_intensities_apr_may_2_cleaned)
##  [1] "id"            "activity_hour" "intensity00"   "intensity01"  
##  [5] "intensity02"   "intensity03"   "intensity04"   "intensity05"  
##  [9] "intensity06"   "intensity07"   "intensity08"   "intensity09"  
## [13] "intensity10"   "intensity11"   "intensity12"   "intensity13"  
## [17] "intensity14"   "intensity15"   "intensity16"   "intensity17"  
## [21] "intensity18"   "intensity19"   "intensity20"   "intensity21"  
## [25] "intensity22"   "intensity23"   "intensity24"   "intensity25"  
## [29] "intensity26"   "intensity27"   "intensity28"   "intensity29"  
## [33] "intensity30"   "intensity31"   "intensity32"   "intensity33"  
## [37] "intensity34"   "intensity35"   "intensity36"   "intensity37"  
## [41] "intensity38"   "intensity39"   "intensity40"   "intensity41"  
## [45] "intensity42"   "intensity43"   "intensity44"   "intensity45"  
## [49] "intensity46"   "intensity47"   "intensity48"   "intensity49"  
## [53] "intensity50"   "intensity51"   "intensity52"   "intensity53"  
## [57] "intensity54"   "intensity55"   "intensity56"   "intensity57"  
## [61] "intensity58"   "intensity59"
colnames(minute_mets_mar_apr_cleaned)
## [1] "id"              "activity_minute" "me_ts"
colnames(minute_mets_apr_may_cleaned)
## [1] "id"              "activity_minute" "me_ts"
colnames(minute_sleep_mar_apr_cleaned)
## [1] "id"     "date"   "value"  "log_id"
colnames(minute_sleep_apr_may_cleaned)
## [1] "id"     "date"   "value"  "log_id"
colnames(minute_steps_mar_apr_cleaned)
## [1] "id"              "activity_minute" "steps"
colnames(minute_steps_apr_may_cleaned)
## [1] "id"              "activity_minute" "steps"
colnames(minute_steps_apr_may_2_cleaned)
##  [1] "id"            "activity_hour" "steps00"       "steps01"      
##  [5] "steps02"       "steps03"       "steps04"       "steps05"      
##  [9] "steps06"       "steps07"       "steps08"       "steps09"      
## [13] "steps10"       "steps11"       "steps12"       "steps13"      
## [17] "steps14"       "steps15"       "steps16"       "steps17"      
## [21] "steps18"       "steps19"       "steps20"       "steps21"      
## [25] "steps22"       "steps23"       "steps24"       "steps25"      
## [29] "steps26"       "steps27"       "steps28"       "steps29"      
## [33] "steps30"       "steps31"       "steps32"       "steps33"      
## [37] "steps34"       "steps35"       "steps36"       "steps37"      
## [41] "steps38"       "steps39"       "steps40"       "steps41"      
## [45] "steps42"       "steps43"       "steps44"       "steps45"      
## [49] "steps46"       "steps47"       "steps48"       "steps49"      
## [53] "steps50"       "steps51"       "steps52"       "steps53"      
## [57] "steps54"       "steps55"       "steps56"       "steps57"      
## [61] "steps58"       "steps59"
colnames(sleep_day_apr_may_cleaned)
## [1] "id"                   "sleep_day"            "total_sleep_records" 
## [4] "total_minutes_asleep" "total_time_in_bed"
colnames(weight_log_mar_apr_cleaned)
## [1] "id"               "date"             "weight_kg"        "weight_pounds"   
## [5] "fat"              "bmi"              "is_manual_report" "log_id"
colnames(weight_log_apr_may_cleaned)
## [1] "id"               "date"             "weight_kg"        "weight_pounds"   
## [5] "fat"              "bmi"              "is_manual_report" "log_id"
daily_activity_mar_apr_cleaned$activity_date <- as.Date(daily_activity_mar_apr_cleaned$activity_date, format = "%m/%d/%Y")
daily_activity_mar_apr_cleaned$activity_date <- strftime(daily_activity_mar_apr_cleaned$activity_date, format = "%Y-%m-%d")

daily_activity_apr_may_cleaned$activity_date <- as.Date(daily_activity_apr_may_cleaned$activity_date, format = "%m/%d/%Y")
daily_activity_apr_may_cleaned$activity_date <- strftime(daily_activity_apr_may_cleaned$activity_date, format = "%Y-%m-%d")

daily_calories_apr_may_cleaned$activity_day <- as.Date(daily_calories_apr_may_cleaned$activity_day, format = "%m/%d/%Y")
daily_calories_apr_may_cleaned$activity_day <- strftime(daily_calories_apr_may_cleaned$activity_day, format = "%Y-%m-%d")

daily_intensities_apr_may_cleaned$activity_day <- as.Date(daily_intensities_apr_may_cleaned$activity_day, format = "%m/%d/%Y")
daily_intensities_apr_may_cleaned$activity_day <- strftime(daily_intensities_apr_may_cleaned$activity_day, format = "%Y-%m-%d")

daily_steps_apr_may_cleaned$activity_day <- as.Date(daily_steps_apr_may_cleaned$activity_day, format = "%m/%d/%Y")
daily_steps_apr_may_cleaned$activity_day <- strftime(daily_steps_apr_may_cleaned$activity_day, format = "%Y-%m-%d")

heart_rate_mar_apr_cleaned$time <- strptime(heart_rate_mar_apr_cleaned$time, format = "%m/%d/%Y %I:%M:%S %p")
heart_rate_mar_apr_cleaned$time <- strftime(heart_rate_mar_apr_cleaned$time, format = "%Y-%m-%d %H:%M:%S")

heart_rate_apr_may_cleaned$time <- strptime(heart_rate_apr_may_cleaned$time, format = "%m/%d/%Y %I:%M:%S %p")
heart_rate_apr_may_cleaned$time <- strftime(heart_rate_apr_may_cleaned$time, format = "%Y-%m-%d %H:%M:%S")

hourly_calories_mar_apr_cleaned$activity_hour <- strptime(hourly_calories_mar_apr_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
hourly_calories_mar_apr_cleaned$activity_hour <- strftime(hourly_calories_mar_apr_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

hourly_calories_apr_may_cleaned$activity_hour <- strptime(hourly_calories_apr_may_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
hourly_calories_apr_may_cleaned$activity_hour <- strftime(hourly_calories_apr_may_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

hourly_intensities_mar_apr_cleaned$activity_hour <- strptime(hourly_intensities_mar_apr_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
hourly_intensities_mar_apr_cleaned$activity_hour <- strftime(hourly_intensities_mar_apr_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

hourly_intensities_apr_may_cleaned$activity_hour <- strptime(hourly_intensities_apr_may_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
hourly_intensities_apr_may_cleaned$activity_hour <- strftime(hourly_intensities_apr_may_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

hourly_steps_mar_apr_cleaned$activity_hour <- strptime(hourly_steps_mar_apr_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
hourly_steps_mar_apr_cleaned$activity_hour <- strftime(hourly_steps_mar_apr_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

hourly_steps_apr_may_cleaned$activity_hour <- strptime(hourly_steps_apr_may_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
hourly_steps_apr_may_cleaned$activity_hour <- strftime(hourly_steps_apr_may_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

minute_calories_mar_apr_cleaned$activity_minute <- strptime(minute_calories_mar_apr_cleaned$activity_minute, format = "%m/%d/%Y %I:%M:%S %p")
minute_calories_mar_apr_cleaned$activity_minute <- strftime(minute_calories_mar_apr_cleaned$activity_minute, format = "%Y-%m-%d %H:%M:%S")

minute_calories_apr_may_cleaned$activity_minute <- strptime(minute_calories_apr_may_cleaned$activity_minute, format = "%m/%d/%Y %I:%M:%S %p")
minute_calories_apr_may_cleaned$activity_minute <- strftime(minute_calories_apr_may_cleaned$activity_minute, format = "%Y-%m-%d %H:%M:%S")

minute_calories_apr_may_2_cleaned$activity_hour <- strptime(minute_calories_apr_may_2_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
minute_calories_apr_may_2_cleaned$activity_hour <- strftime(minute_calories_apr_may_2_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

minute_intensities_mar_apr_cleaned$activity_minute <- strptime(minute_intensities_mar_apr_cleaned$activity_minute, format = "%m/%d/%Y %I:%M:%S %p")
minute_intensities_mar_apr_cleaned$activity_minute <- strftime(minute_intensities_mar_apr_cleaned$activity_minute, format = "%Y-%m-%d %H:%M:%S")

minute_intensities_apr_may_cleaned$activity_minute <- strptime(minute_intensities_apr_may_cleaned$activity_minute, format = "%m/%d/%Y %I:%M:%S %p")
minute_intensities_apr_may_cleaned$activity_minute <- strftime(minute_intensities_apr_may_cleaned$activity_minute, format = "%Y-%m-%d %H:%M:%S")

minute_intensities_apr_may_2_cleaned$activity_hour <- strptime(minute_intensities_apr_may_2_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
minute_intensities_apr_may_2_cleaned$activity_hour <- strftime(minute_intensities_apr_may_2_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

minute_mets_mar_apr_cleaned$activity_minute <- strptime(minute_mets_mar_apr_cleaned$activity_minute, format = "%m/%d/%Y %I:%M:%S %p")
minute_mets_mar_apr_cleaned$activity_minute <- strftime(minute_mets_mar_apr_cleaned$activity_minute, format = "%Y-%m-%d %H:%M:%S")

minute_mets_apr_may_cleaned$activity_minute <- strptime(minute_mets_apr_may_cleaned$activity_minute, format = "%m/%d/%Y %I:%M:%S %p")
minute_mets_apr_may_cleaned$activity_minute <- strftime(minute_mets_apr_may_cleaned$activity_minute, format = "%Y-%m-%d %H:%M:%S")

minute_sleep_mar_apr_cleaned$date <- strptime(minute_sleep_mar_apr_cleaned$date, format = "%m/%d/%Y %I:%M:%S %p")
minute_sleep_mar_apr_cleaned$date <- strftime(minute_sleep_mar_apr_cleaned$date, format = "%Y-%m-%d %H:%M:%S")

minute_sleep_apr_may_cleaned$date <- strptime(minute_sleep_apr_may_cleaned$date, format = "%m/%d/%Y %I:%M:%S %p")
minute_sleep_apr_may_cleaned$date <- strftime(minute_sleep_apr_may_cleaned$date, format = "%Y-%m-%d %H:%M:%S")

minute_steps_mar_apr_cleaned$activity_minute <- strptime(minute_steps_mar_apr_cleaned$activity_minute, format = "%m/%d/%Y %I:%M:%S %p")
minute_steps_mar_apr_cleaned$activity_minute <- strftime(minute_steps_mar_apr_cleaned$activity_minute, format = "%Y-%m-%d %H:%M:%S")

minute_steps_apr_may_cleaned$activity_minute <- strptime(minute_steps_apr_may_cleaned$activity_minute, format = "%m/%d/%Y %I:%M:%S %p")
minute_steps_apr_may_cleaned$activity_minute <- strftime(minute_steps_apr_may_cleaned$activity_minute, format = "%Y-%m-%d %H:%M:%S")

minute_steps_apr_may_2_cleaned$activity_hour <- strptime(minute_steps_apr_may_2_cleaned$activity_hour, format = "%m/%d/%Y %I:%M:%S %p")
minute_steps_apr_may_2_cleaned$activity_hour <- strftime(minute_steps_apr_may_2_cleaned$activity_hour, format = "%Y-%m-%d %H:%M:%S")

sleep_day_apr_may_cleaned$sleep_day <- strptime(sleep_day_apr_may_cleaned$sleep_day, format = "%m/%d/%Y %I:%M:%S %p")
sleep_day_apr_may_cleaned$sleep_day <- strftime(sleep_day_apr_may_cleaned$sleep_day, format = "%Y-%m-%d %H:%M:%S")

weight_log_mar_apr_cleaned$date <- strptime(weight_log_mar_apr_cleaned$date, format = "%m/%d/%Y %I:%M:%S %p")
weight_log_mar_apr_cleaned$date <- strftime(weight_log_mar_apr_cleaned$date, format = "%Y-%m-%d %H:%M:%S")

weight_log_apr_may_cleaned$date <- strptime(weight_log_apr_may_cleaned$date, format = "%m/%d/%Y %I:%M:%S %p")
weight_log_apr_may_cleaned$date <- strftime(weight_log_apr_may_cleaned$date, format = "%Y-%m-%d %H:%M:%S")
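The two-step parse-and-format pattern above can be wrapped in a small helper. The sketch below is one possible version, not the code used in this analysis. The function name `to_iso_datetime` is hypothetical, and the explicit `tz = "UTC"` is an added assumption: the calls above rely on the session's default time zone, and some timestamps in the data (for example 3/13/2016 2:39:30 AM) fall in an hour that does not exist in time zones that moved to daylight saving time on that date. Parsing of `%p` (AM/PM) also depends on the session locale.

```r
# Hypothetical helper: parse "m/d/Y I:M:S AM/PM" strings and return
# "Y-m-d H:M:S" strings. tz = "UTC" is an assumption made so that the
# result does not depend on local daylight saving rules.
to_iso_datetime <- function(x) {
  parsed <- as.POSIXct(x, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
  format(parsed, "%Y-%m-%d %H:%M:%S")
}

example_times <- c("4/12/2016 12:00:00 AM",
                   "3/13/2016 2:39:30 AM",
                   "4/13/2016 9:00:00 PM")
to_iso_datetime(example_times)
```

With such a helper, each column conversion becomes a single assignment, for example `df$activity_hour <- to_iso_datetime(df$activity_hour)` for a data frame `df`.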
heart_rate_mar_apr_cleaned <- separate(heart_rate_mar_apr_cleaned, col = "time", into = c("activity_date", "activity_time"), sep = " ")

heart_rate_apr_may_cleaned <- separate(heart_rate_apr_may_cleaned, col = "time", into = c("activity_date", "activity_time"), sep = " ")

hourly_calories_mar_apr_cleaned <- separate(hourly_calories_mar_apr_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

hourly_calories_apr_may_cleaned <- separate(hourly_calories_apr_may_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

hourly_intensities_mar_apr_cleaned <- separate(hourly_intensities_mar_apr_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

hourly_intensities_apr_may_cleaned <- separate(hourly_intensities_apr_may_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

hourly_steps_mar_apr_cleaned <- separate(hourly_steps_mar_apr_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

hourly_steps_apr_may_cleaned <- separate(hourly_steps_apr_may_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

minute_calories_mar_apr_cleaned <- separate(minute_calories_mar_apr_cleaned, col = "activity_minute", into = c("activity_date", "activity_time"), sep = " ")

minute_calories_apr_may_cleaned <- separate(minute_calories_apr_may_cleaned, col = "activity_minute", into = c("activity_date", "activity_time"), sep = " ")

minute_calories_apr_may_2_cleaned <- separate(minute_calories_apr_may_2_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

minute_intensities_mar_apr_cleaned <- separate(minute_intensities_mar_apr_cleaned, col = "activity_minute", into = c("activity_date", "activity_time"), sep = " ")

minute_intensities_apr_may_cleaned <- separate(minute_intensities_apr_may_cleaned, col = "activity_minute", into = c("activity_date", "activity_time"), sep = " ")

minute_intensities_apr_may_2_cleaned <- separate(minute_intensities_apr_may_2_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

minute_mets_mar_apr_cleaned <- separate(minute_mets_mar_apr_cleaned, col = "activity_minute", into = c("activity_date", "activity_time"), sep = " ")

minute_mets_apr_may_cleaned <- separate(minute_mets_apr_may_cleaned, col = "activity_minute", into = c("activity_date", "activity_time"), sep = " ")

minute_sleep_mar_apr_cleaned <- separate(minute_sleep_mar_apr_cleaned, col = "date", into = c("activity_date", "activity_time"), sep = " ")

minute_sleep_apr_may_cleaned <- separate(minute_sleep_apr_may_cleaned, col = "date", into = c("activity_date", "activity_time"), sep = " ")

minute_steps_mar_apr_cleaned <- separate(minute_steps_mar_apr_cleaned, col = "activity_minute", into = c("activity_date", "activity_time"), sep = " ")

minute_steps_apr_may_cleaned <- separate(minute_steps_apr_may_cleaned, col = "activity_minute", into = c("activity_date", "activity_time"), sep = " ")

minute_steps_apr_may_2_cleaned <- separate(minute_steps_apr_may_2_cleaned, col = "activity_hour", into = c("activity_date", "activity_time"), sep = " ")

sleep_day_apr_may_cleaned <- separate(sleep_day_apr_may_cleaned, col = "sleep_day", into = c("activity_date", "activity_time"), sep = " ")

weight_log_mar_apr_cleaned <- separate(weight_log_mar_apr_cleaned, col = "date", into = c("activity_date", "activity_time"), sep = " ")

weight_log_apr_may_cleaned <- separate(weight_log_apr_may_cleaned, col = "date", into = c("activity_date", "activity_time"), sep = " ")
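To make the effect of `separate()` concrete, here is a small sketch on made-up data (`toy_hourly` and its values are illustrative only). The two new columns take the place of the original date-time column, so every column to its right moves one position further along.

```r
library(tidyr)

# Hypothetical two-row table with an already reformatted date-time column.
toy_hourly <- data.frame(
  id = c(1, 1),
  activity_hour = c("2016-04-13 00:00:00", "2016-04-13 21:00:00"),
  calories = c(81, 120)
)

# Split on the single space between the date and the time.
toy_split <- separate(toy_hourly, col = "activity_hour",
                      into = c("activity_date", "activity_time"), sep = " ")

colnames(toy_split)
toy_split$activity_time
```

In this toy table the value column `calories` moves from position 3 to position 4 after the split.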
colnames(daily_calories_apr_may_cleaned)[2] <- "activity_date"

colnames(daily_intensities_apr_may_cleaned)[2] <- "activity_date"

colnames(daily_steps_apr_may_cleaned)[2] <- "activity_date"

colnames(daily_steps_apr_may_cleaned)[3] <- "total_steps"

colnames(heart_rate_mar_apr_cleaned)[4] <- "heart_rate"

colnames(heart_rate_apr_may_cleaned)[4] <- "heart_rate"

colnames(hourly_steps_mar_apr_cleaned)[4] <- "total_steps"

colnames(hourly_steps_apr_may_cleaned)[4] <- "total_steps"

colnames(minute_intensities_mar_apr_cleaned)[4] <- "total_intensity"

colnames(minute_intensities_apr_may_cleaned)[4] <- "total_intensity"

colnames(minute_mets_mar_apr_cleaned)[4] <- "mets"

colnames(minute_mets_apr_may_cleaned)[4] <- "mets"

colnames(minute_sleep_mar_apr_cleaned)[4] <- "sleep_m"

colnames(minute_sleep_apr_may_cleaned)[4] <- "sleep_m"

colnames(minute_steps_mar_apr_cleaned)[4] <- "total_steps"

colnames(minute_steps_apr_may_cleaned)[4] <- "total_steps"
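The colnames() calls above rename columns by position, which depends on the column order left by the earlier cleaning steps. A possible alternative, sketched here on a made-up table, is to rename by the existing column name with dplyr::rename(). The toy table and its original column name steps are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical one-row table; "steps" is an assumed original column name.
toy_steps <- data.frame(
  id = 1503960366,
  activity_date = "2016-04-12",
  activity_time = "00:00:00",
  steps = 0
)

# Positional rename, in the style of the calls above.
by_position <- toy_steps
colnames(by_position)[4] <- "total_steps"

# Name-based rename: fails with an error if "steps" is not present,
# instead of silently renaming whichever column is in position 4.
by_name <- rename(toy_steps, total_steps = steps)

print(names(by_position))
print(names(by_name))
```

Both approaches give the same column names here. The name-based form only differs in how it behaves when the column order changes.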
daily_activity_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 457
Number of columns 15
_______________________
Column type frequency:
character 1
numeric 14
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.628595e+09 2.293781e+09 1503960366 2.347168e+09 4.057193e+09 6.391747e+09 8.877689e+09
total_steps 0 1 6.546560e+03 5.398490e+03 0 1.988000e+03 5.986000e+03 1.019800e+04 2.849700e+04
total_distance 0 1 4.660000e+00 4.080000e+00 0 1.410000e+00 4.090000e+00 7.160000e+00 2.753000e+01
tracker_distance 0 1 4.610000e+00 4.070000e+00 0 1.280000e+00 4.090000e+00 7.110000e+00 2.753000e+01
logged_activities_distance 0 1 1.800000e-01 8.500000e-01 0 0.000000e+00 0.000000e+00 0.000000e+00 6.730000e+00
very_active_distance 0 1 1.180000e+00 2.490000e+00 0 0.000000e+00 0.000000e+00 1.310000e+00 2.192000e+01
moderately_active_distance 0 1 4.800000e-01 8.300000e-01 0 0.000000e+00 2.000000e-02 6.700000e-01 6.400000e+00
light_active_distance 0 1 2.890000e+00 2.240000e+00 0 8.700000e-01 2.930000e+00 4.460000e+00 1.251000e+01
sedentary_active_distance 0 1 0.000000e+00 1.000000e-02 0 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e-01
very_active_minutes 0 1 1.662000e+01 2.892000e+01 0 0.000000e+00 0.000000e+00 2.500000e+01 2.020000e+02
fairly_active_minutes 0 1 1.307000e+01 3.621000e+01 0 0.000000e+00 1.000000e+00 1.600000e+01 6.600000e+02
lightly_active_minutes 0 1 1.700700e+02 1.222100e+02 0 6.400000e+01 1.810000e+02 2.570000e+02 7.200000e+02
sedentary_minutes 0 1 9.952800e+02 3.370200e+02 32 7.280000e+02 1.057000e+03 1.285000e+03 1.440000e+03
calories 0 1 2.189450e+03 8.154800e+02 0 1.776000e+03 2.062000e+03 2.667000e+03 4.562000e+03
daily_activity_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 940
Number of columns 15
_______________________
Column type frequency:
character 1
numeric 14
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.855407e+09 2.424805e+09 1503960366 2.320127e+09 4.445115e+09 6.962181e+09 8.877689e+09
total_steps 0 1 7.637910e+03 5.087150e+03 0 3.789750e+03 7.405500e+03 1.072700e+04 3.601900e+04
total_distance 0 1 5.490000e+00 3.920000e+00 0 2.620000e+00 5.240000e+00 7.710000e+00 2.803000e+01
tracker_distance 0 1 5.480000e+00 3.910000e+00 0 2.620000e+00 5.240000e+00 7.710000e+00 2.803000e+01
logged_activities_distance 0 1 1.100000e-01 6.200000e-01 0 0.000000e+00 0.000000e+00 0.000000e+00 4.940000e+00
very_active_distance 0 1 1.500000e+00 2.660000e+00 0 0.000000e+00 2.100000e-01 2.050000e+00 2.192000e+01
moderately_active_distance 0 1 5.700000e-01 8.800000e-01 0 0.000000e+00 2.400000e-01 8.000000e-01 6.480000e+00
light_active_distance 0 1 3.340000e+00 2.040000e+00 0 1.950000e+00 3.360000e+00 4.780000e+00 1.071000e+01
sedentary_active_distance 0 1 0.000000e+00 1.000000e-02 0 0.000000e+00 0.000000e+00 0.000000e+00 1.100000e-01
very_active_minutes 0 1 2.116000e+01 3.284000e+01 0 0.000000e+00 4.000000e+00 3.200000e+01 2.100000e+02
fairly_active_minutes 0 1 1.356000e+01 1.999000e+01 0 0.000000e+00 6.000000e+00 1.900000e+01 1.430000e+02
lightly_active_minutes 0 1 1.928100e+02 1.091700e+02 0 1.270000e+02 1.990000e+02 2.640000e+02 5.180000e+02
sedentary_minutes 0 1 9.912100e+02 3.012700e+02 0 7.297500e+02 1.057500e+03 1.229500e+03 1.440000e+03
calories 0 1 2.303610e+03 7.181700e+02 0 1.828500e+03 2.134000e+03 2.793250e+03 4.900000e+03
daily_calories_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 940
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.855407e+09 2.424805e+09 1503960366 2320127002.0 4445114986 6.962181e+09 8877689391
calories 0 1 2.303610e+03 7.181700e+02 0 1828.5 2134 2.793250e+03 4900
daily_intensities_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 940
Number of columns 10
_______________________
Column type frequency:
character 1
numeric 9
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.855407e+09 2.424805e+09 1503960366 2.320127e+09 4.445115e+09 6.962181e+09 8.877689e+09
sedentary_minutes 0 1 9.912100e+02 3.012700e+02 0 7.297500e+02 1.057500e+03 1.229500e+03 1.440000e+03
lightly_active_minutes 0 1 1.928100e+02 1.091700e+02 0 1.270000e+02 1.990000e+02 2.640000e+02 5.180000e+02
fairly_active_minutes 0 1 1.356000e+01 1.999000e+01 0 0.000000e+00 6.000000e+00 1.900000e+01 1.430000e+02
very_active_minutes 0 1 2.116000e+01 3.284000e+01 0 0.000000e+00 4.000000e+00 3.200000e+01 2.100000e+02
sedentary_active_distance 0 1 0.000000e+00 1.000000e-02 0 0.000000e+00 0.000000e+00 0.000000e+00 1.100000e-01
light_active_distance 0 1 3.340000e+00 2.040000e+00 0 1.950000e+00 3.360000e+00 4.780000e+00 1.071000e+01
moderately_active_distance 0 1 5.700000e-01 8.800000e-01 0 0.000000e+00 2.400000e-01 8.000000e-01 6.480000e+00
very_active_distance 0 1 1.500000e+00 2.660000e+00 0 0.000000e+00 2.100000e-01 2.050000e+00 2.192000e+01
daily_steps_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 940
Number of columns 3
_______________________
Column type frequency:
character 1
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.855407e+09 2.424805e+09 1503960366 2.320127e+09 4445114986.0 6962181067 8877689391
total_steps 0 1 7.637910e+03 5.087150e+03 0 3.789750e+03 7405.5 10727 36019
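The April to May summaries for daily calories, intensities, and steps repeat the statistics reported for the same columns of daily_activity_apr_may_cleaned (for example, calories has a mean of 2303.61 and a standard deviation of 718.17 in both). This suggests, but does not prove, that the three smaller tables duplicate columns of the activity table. One way to check this would be an anti-join on the shared columns, sketched below on invented toy data. The table names and values are assumptions for illustration.

```r
library(dplyr)

# Hypothetical toy version of the daily activity table.
toy_activity <- data.frame(
  id = c(1, 1, 2),
  activity_date = c("2016-04-12", "2016-04-13", "2016-04-12"),
  total_steps = c(13162, 10735, 0),
  calories = c(1985, 1797, 1347)
)

# Hypothetical toy version of the daily calories table, built as a subset.
toy_calories <- toy_activity[, c("id", "activity_date", "calories")]

# Rows of toy_calories that have no exact match in toy_activity.
unmatched <- anti_join(
  toy_calories, toy_activity,
  by = c("id", "activity_date", "calories")
)

# Zero unmatched rows means every calories record also appears in the
# activity table, so the smaller table adds no new information.
nrow(unmatched)
```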
heart_rate_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1154681
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 15 0
activity_time 0 1 8 8 0 85617 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 5.352122e+09 2.033584e+09 2022484408 4020332650 5553957443 6962181067 8877689391
heart_rate 0 1 7.976000e+01 1.873000e+01 36 66 77 90 185
heart_rate_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 2483658
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 86046 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 5.513765e+09 1950223761.0 2022484408 4388161847 5553957443 6962181067 8877689391
heart_rate 0 1 7.733000e+01 19.4 36 63 73 88 203
hourly_calories_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 24084
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.889424e+09 2421565819.2 1503960366 2347167796 4558609924 6962181067 8877689391
calories 0 1 9.427000e+01 59.4 42 61 77 104 933
hourly_calories_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 22099
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.848235e+09 2.4225e+09 1503960366 2320127002 4445114986 6962181067 8877689391
calories 0 1 9.739000e+01 6.0700e+01 42 63 83 108 948
hourly_intensities_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 24084
Number of columns 5
_______________________
Column type frequency:
character 2
numeric 3
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.889424e+09 2.421566e+09 1503960366 2347167796 4.55861e+09 6.962181e+09 8877689391
total_intensity 0 1 1.083000e+01 2.031000e+01 0 0 1.00000e+00 1.400000e+01 180
average_intensity 0 1 1.800000e-01 3.400000e-01 0 0 2.00000e-02 2.300000e-01 3
hourly_intensities_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 22099
Number of columns 5
_______________________
Column type frequency:
character 2
numeric 3
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.848235e+09 2.4225e+09 1503960366 2320127002 4.445115e+09 6.962181e+09 8877689391
total_intensity 0 1 1.204000e+01 2.1130e+01 0 0 3.000000e+00 1.600000e+01 180
average_intensity 0 1 2.000000e-01 3.5000e-01 0 0 5.000000e-02 2.700000e-01 3
hourly_steps_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 24084
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.889424e+09 2.421566e+09 1503960366 2347167796 4558609924 6962181067 8877689391
total_steps 0 1 2.862200e+02 6.649200e+02 0 0 10 289 10565
hourly_steps_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 22099
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.848235e+09 2.4225e+09 1503960366 2320127002 4445114986 6962181067 8877689391
total_steps 0 1 3.201700e+02 6.9038e+02 0 0 40 357 10554
minute_calories_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1445040
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0
activity_time 0 1 8 8 0 1440 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.889424e+09 2.421516e+09 1503960366 2.347168e+09 4.55861e+09 6.962181e+09 8.877689e+09
calories 0 1 1.570000e+00 1.360000e+00 0 9.400000e-01 1.22000e+00 1.410000e+00 2.301000e+01
minute_calories_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1325580
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 1440 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.847898e+09 2.422313e+09 1503960366 2.320127e+09 4.445115e+09 6.962181e+09 8.877689e+09
calories 0 1 1.620000e+00 1.410000e+00 0 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
minute_calories_apr_may_2_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 21645
Number of columns 63
_______________________
Column type frequency:
character 2
numeric 61
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.836965e+09 2.424088e+09 1.50396e+09 2.320127e+09 4.445115e+09 6.962181e+09 8.877689e+09
calories00 0 1 1.620000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories01 0 1 1.630000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories02 0 1 1.640000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories03 0 1 1.640000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories04 0 1 1.640000e+00 1.430000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories05 0 1 1.640000e+00 1.440000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories06 0 1 1.640000e+00 1.440000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories07 0 1 1.630000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories08 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories09 0 1 1.620000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.676000e+01
calories10 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.744000e+01
calories11 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.676000e+01
calories12 0 1 1.610000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.744000e+01
calories13 0 1 1.610000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.668000e+01
calories14 0 1 1.610000e+00 1.400000e+00 0.00000e+00 9.400000e-01 1.220000e+00 1.430000e+00 1.693000e+01
calories15 0 1 1.610000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.719000e+01
calories16 0 1 1.610000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.719000e+01
calories17 0 1 1.610000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.744000e+01
calories18 0 1 1.610000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.693000e+01
calories19 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.668000e+01
calories20 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.630000e+01
calories21 0 1 1.610000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.683000e+01
calories22 0 1 1.630000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.778000e+01
calories23 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.778000e+01
calories24 0 1 1.610000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.735000e+01
calories25 0 1 1.620000e+00 1.420000e+00 0.00000e+00 9.400000e-01 1.220000e+00 1.430000e+00 1.709000e+01
calories26 0 1 1.610000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.699000e+01
calories27 0 1 1.620000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.723000e+01
calories28 0 1 1.620000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.683000e+01
calories29 0 1 1.620000e+00 1.430000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.735000e+01
calories30 0 1 1.620000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.735000e+01
calories31 0 1 1.630000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.761000e+01
calories32 0 1 1.630000e+00 1.430000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.761000e+01
calories33 0 1 1.640000e+00 1.440000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.761000e+01
calories34 0 1 1.630000e+00 1.430000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.787000e+01
calories35 0 1 1.630000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.787000e+01
calories36 0 1 1.640000e+00 1.460000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories37 0 1 1.640000e+00 1.450000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories38 0 1 1.630000e+00 1.450000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories39 0 1 1.630000e+00 1.430000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories40 0 1 1.630000e+00 1.420000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories41 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories42 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories43 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories44 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories45 0 1 1.620000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories46 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories47 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories48 0 1 1.620000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories49 0 1 1.620000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories50 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories51 0 1 1.610000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories52 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories53 0 1 1.620000e+00 1.400000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories54 0 1 1.620000e+00 1.410000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories55 0 1 1.620000e+00 1.390000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.975000e+01
calories56 0 1 1.610000e+00 1.380000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories57 0 1 1.610000e+00 1.370000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories58 0 1 1.610000e+00 1.370000e+00 7.00000e-01 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
calories59 0 1 1.610000e+00 1.370000e+00 0.00000e+00 9.400000e-01 1.220000e+00 1.430000e+00 1.973000e+01
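The summary above shows that minute_calories_apr_may_2_cleaned is in wide layout: each row is one hour (24 unique activity_time values) and the columns calories00 to calories59 hold the 60 per-minute values. If a long layout matching the other minute-level tables is needed, tidyr::pivot_longer() is one option. The sketch below uses a small invented table with only three minute columns. The values and the output column name minute are assumptions for illustration.

```r
library(tidyr)

# Hypothetical wide table: one row per hour, one column per minute.
toy_wide <- data.frame(
  id = c(1503960366, 1503960366),
  activity_date = c("2016-04-13", "2016-04-13"),
  activity_time = c("00:00:00", "01:00:00"),
  calories00 = c(0.79, 0.79),
  calories01 = c(0.79, 0.94),
  calories02 = c(0.79, 0.94)
)

# Stack the per-minute columns into one row per minute. The "calories"
# prefix is stripped so that "minute" holds "00", "01", "02".
toy_long <- pivot_longer(
  toy_wide,
  cols = starts_with("calories"),
  names_to = "minute",
  names_prefix = "calories",
  values_to = "calories"
)

print(toy_long)
```

The same call pattern would apply to the intensity and steps wide tables by changing the prefix.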
minute_intensities_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1445040
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0
activity_time 0 1 8 8 0 1440 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.889424e+09 2.421516e+09 1503960366 2347167796 4558609924 6962181067 8877689391
total_intensity 0 1 1.800000e-01 4.900000e-01 0 0 0 0 3
minute_intensities_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1325580
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 1440 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4847897691.9 2.422313e+09 1503960366 2320127002 4445114986 6962181067 8877689391
total_intensity 0 1 0.2 5.200000e-01 0 0 0 0 3
minute_intensities_apr_may_2_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 21645
Number of columns 63
_______________________
Column type frequency:
character 2
numeric 61
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.836965e+09 2.424088e+09 1503960366 2320127002 4445114986 6962181067 8877689391
intensity00 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity01 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity02 0 1 2.100000e-01 5.200000e-01 0 0 0 0 3
intensity03 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity04 0 1 2.100000e-01 5.200000e-01 0 0 0 0 3
intensity05 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity06 0 1 2.100000e-01 5.200000e-01 0 0 0 0 3
intensity07 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity08 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity09 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity10 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity11 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity12 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity13 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity14 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity15 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity16 0 1 1.900000e-01 5.200000e-01 0 0 0 0 3
intensity17 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity18 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity19 0 1 2.000000e-01 5.300000e-01 0 0 0 0 3
intensity20 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity21 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity22 0 1 2.000000e-01 5.300000e-01 0 0 0 0 3
intensity23 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity24 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity25 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity26 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity27 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity28 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity29 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity30 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity31 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity32 0 1 2.100000e-01 5.300000e-01 0 0 0 0 3
intensity33 0 1 2.100000e-01 5.300000e-01 0 0 0 0 3
intensity34 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity35 0 1 2.100000e-01 5.300000e-01 0 0 0 0 3
intensity36 0 1 2.100000e-01 5.300000e-01 0 0 0 0 3
intensity37 0 1 2.100000e-01 5.300000e-01 0 0 0 0 3
intensity38 0 1 2.000000e-01 5.300000e-01 0 0 0 0 3
intensity39 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity40 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity41 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity42 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity43 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity44 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity45 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity46 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity47 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity48 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity49 0 1 2.000000e-01 5.200000e-01 0 0 0 0 3
intensity50 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity51 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity52 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity53 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity54 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity55 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity56 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity57 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity58 0 1 2.000000e-01 5.100000e-01 0 0 0 0 3
intensity59 0 1 2.000000e-01 5.000000e-01 0 0 0 0 3
minute_mets_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1445040
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0
activity_time 0 1 8 8 0 1440 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.889424e+09 2.421516e+09 1503960366 2347167796 4558609924 6962181067 8877689391
mets 0 1 1.424000e+01 1.154000e+01 0 10 10 11 189
minute_mets_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1325580
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 1440 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.847898e+09 2.422313e+09 1503960366 2320127002 4445114986 6962181067 8877689391
mets 0 1 1.469000e+01 1.206000e+01 0 10 10 11 157
minute_sleep_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 198559
Number of columns 5
_______________________
Column type frequency:
character 2
numeric 3
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 33 0
activity_time 0 1 8 8 0 3927 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.824304e+09 2.173935e+09 1503960366 2347167796 4702921684 6775888955 8792009665
sleep_m 0 1 1.090000e+00 3.100000e-01 1 1 1 1 3
log_id 0 1 1.124161e+10 7.969858e+07 11103653021 11165512026 11243951252 11310735495 11374876178
minute_sleep_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 188521
Number of columns 5
_______________________
Column type frequency:
character 2
numeric 3
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0
activity_time 0 1 8 8 0 2880 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.996595e+09 2.066950e+09 1503960366 3977333714 4702921684 6962181067 8792009665
sleep_m 0 1 1.100000e+00 3.300000e-01 1 1 1 1 3
log_id 0 1 1.149611e+10 6.822863e+07 11372227280 11439308639 11501142214 11552534115 11616251768
minute_steps_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1445040
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 32 0
activity_time 0 1 8 8 0 1440 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.889424e+09 2.421516e+09 1503960366 2347167796 4558609924 6962181067 8877689391
total_steps 0 1 4.770000e+00 1.722000e+01 0 0 0 0 204
minute_steps_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 1325580
Number of columns 4
_______________________
Column type frequency:
character 2
numeric 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 1440 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.847898e+09 2.422313e+09 1503960366 2320127002 4445114986 6962181067 8877689391
total_steps 0 1 5.340000e+00 1.813000e+01 0 0 0 0 220
minute_steps_apr_may_2_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 21645
Number of columns 63
_______________________
Column type frequency:
character 2
numeric 61
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 24 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 4.836965e+09 2.424088e+09 1503960366 2320127002 4445114986 6962181067 8877689391
steps00 0 1 5.300000e+00 1.778000e+01 0 0 0 0 186
steps01 0 1 5.340000e+00 1.768000e+01 0 0 0 0 180
steps02 0 1 5.530000e+00 1.808000e+01 0 0 0 0 182
steps03 0 1 5.470000e+00 1.811000e+01 0 0 0 0 182
steps04 0 1 5.460000e+00 1.829000e+01 0 0 0 0 181
steps05 0 1 5.590000e+00 1.857000e+01 0 0 0 0 180
steps06 0 1 5.560000e+00 1.848000e+01 0 0 0 0 181
steps07 0 1 5.410000e+00 1.834000e+01 0 0 0 0 183
steps08 0 1 5.360000e+00 1.821000e+01 0 0 0 0 180
steps09 0 1 5.360000e+00 1.819000e+01 0 0 0 0 183
steps10 0 1 5.340000e+00 1.834000e+01 0 0 0 0 180
steps11 0 1 5.290000e+00 1.818000e+01 0 0 0 0 181
steps12 0 1 5.300000e+00 1.830000e+01 0 0 0 0 181
steps13 0 1 5.260000e+00 1.835000e+01 0 0 0 0 180
steps14 0 1 5.340000e+00 1.840000e+01 0 0 0 0 182
steps15 0 1 5.280000e+00 1.829000e+01 0 0 0 0 179
steps16 0 1 5.210000e+00 1.815000e+01 0 0 0 0 180
steps17 0 1 5.290000e+00 1.822000e+01 0 0 0 0 183
steps18 0 1 5.350000e+00 1.830000e+01 0 0 0 0 180
steps19 0 1 5.420000e+00 1.849000e+01 0 0 0 0 182
steps20 0 1 5.300000e+00 1.844000e+01 0 0 0 0 179
steps21 0 1 5.290000e+00 1.837000e+01 0 0 0 0 185
steps22 0 1 5.530000e+00 1.871000e+01 0 0 0 0 182
steps23 0 1 5.350000e+00 1.839000e+01 0 0 0 0 187
steps24 0 1 5.310000e+00 1.827000e+01 0 0 0 0 180
steps25 0 1 5.300000e+00 1.830000e+01 0 0 0 0 181
steps26 0 1 5.250000e+00 1.816000e+01 0 0 0 0 186
steps27 0 1 5.310000e+00 1.822000e+01 0 0 0 0 180
steps28 0 1 5.270000e+00 1.802000e+01 0 0 0 0 181
steps29 0 1 5.260000e+00 1.802000e+01 0 0 0 0 183
steps30 0 1 5.400000e+00 1.832000e+01 0 0 0 0 181
steps31 0 1 5.360000e+00 1.812000e+01 0 0 0 0 181
steps32 0 1 5.440000e+00 1.820000e+01 0 0 0 0 181
steps33 0 1 5.500000e+00 1.840000e+01 0 0 0 0 182
steps34 0 1 5.470000e+00 1.832000e+01 0 0 0 0 180
steps35 0 1 5.420000e+00 1.819000e+01 0 0 0 0 187
steps36 0 1 5.580000e+00 1.870000e+01 0 0 0 0 183
steps37 0 1 5.500000e+00 1.850000e+01 0 0 0 0 181
steps38 0 1 5.480000e+00 1.850000e+01 0 0 0 0 185
steps39 0 1 5.340000e+00 1.806000e+01 0 0 0 0 184
steps40 0 1 5.380000e+00 1.803000e+01 0 0 0 0 184
steps41 0 1 5.340000e+00 1.806000e+01 0 0 0 0 184
steps42 0 1 5.260000e+00 1.802000e+01 0 0 0 0 180
steps43 0 1 5.290000e+00 1.784000e+01 0 0 0 0 188
steps44 0 1 5.350000e+00 1.799000e+01 0 0 0 0 220
steps45 0 1 5.240000e+00 1.786000e+01 0 0 0 0 184
steps46 0 1 5.340000e+00 1.809000e+01 0 0 0 0 207
steps47 0 1 5.300000e+00 1.794000e+01 0 0 0 0 190
steps48 0 1 5.320000e+00 1.780000e+01 0 0 0 0 182
steps49 0 1 5.350000e+00 1.795000e+01 0 0 0 0 182
steps50 0 1 5.330000e+00 1.787000e+01 0 0 0 0 182
steps51 0 1 5.190000e+00 1.760000e+01 0 0 0 0 181
steps52 0 1 5.230000e+00 1.762000e+01 0 0 0 0 181
steps53 0 1 5.150000e+00 1.757000e+01 0 0 0 0 181
steps54 0 1 5.220000e+00 1.768000e+01 0 0 0 0 184
steps55 0 1 5.280000e+00 1.783000e+01 0 0 0 0 181
steps56 0 1 5.180000e+00 1.757000e+01 0 0 0 0 182
steps57 0 1 5.250000e+00 1.769000e+01 0 0 0 0 182
steps58 0 1 5.140000e+00 1.743000e+01 0 0 0 0 180
steps59 0 1 5.290000e+00 1.772000e+01 0 0 0 0 189
sleep_day_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 413
Number of columns 6
_______________________
Column type frequency:
character 2
numeric 4
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 1 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1 5.000979e+09 2.06036e+09 1503960366 3977333714 4702921684 6962181067 8792009665
total_sleep_records 0 1 1.120000e+00 3.50000e-01 1 1 1 1 3
total_minutes_asleep 0 1 4.194700e+02 1.18340e+02 58 361 433 490 796
total_time_in_bed 0 1 4.586400e+02 1.27100e+02 61 403 463 526 961
weight_log_mar_apr_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 33
Number of columns 9
_______________________
Column type frequency:
character 2
logical 1
numeric 6
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 14 0
activity_time 0 1 8 8 0 11 0

Variable type: logical

skim_variable n_missing complete_rate mean count
is_manual_report 0 1 0.7 TRU: 23, FAL: 10

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1.00 6.477156e+09 2.308888e+09 1.503960e+09 4.702922e+09 6.962181e+09 8.877689e+09 8.877689e+09
weight_kg 0 1.00 7.344000e+01 1.653000e+01 5.330000e+01 6.170000e+01 6.250000e+01 8.580000e+01 1.296000e+02
weight_pounds 0 1.00 1.619100e+02 3.644000e+01 1.175100e+02 1.360300e+02 1.377900e+02 1.891600e+02 2.857200e+02
fat 31 0.06 1.600000e+01 8.490000e+00 1.000000e+01 1.300000e+01 1.600000e+01 1.900000e+01 2.200000e+01
bmi 0 1.00 2.573000e+01 4.330000e+00 2.145000e+01 2.410000e+01 2.439000e+01 2.576000e+01 4.617000e+01
log_id 0 1.00 1.459959e+12 3.088072e+08 1.459382e+12 1.459753e+12 1.459987e+12 1.460160e+12 1.460506e+12
weight_log_apr_may_cleaned %>% skim_without_charts()
Data summary
Name Piped data
Number of rows 67
Number of columns 9
_______________________
Column type frequency:
character 2
logical 1
numeric 6
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
activity_date 0 1 10 10 0 31 0
activity_time 0 1 8 8 0 26 0

Variable type: logical

skim_variable n_missing complete_rate mean count
is_manual_report 0 1 0.61 TRU: 41, FAL: 26

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
id 0 1.00 7.009282e+09 1.950322e+09 1.503960e+09 6.962181e+09 6.962181e+09 8.877689e+09 8.877689e+09
weight_kg 0 1.00 7.204000e+01 1.392000e+01 5.260000e+01 6.140000e+01 6.250000e+01 8.505000e+01 1.335000e+02
weight_pounds 0 1.00 1.588100e+02 3.070000e+01 1.159600e+02 1.353600e+02 1.377900e+02 1.875000e+02 2.943200e+02
fat 65 0.03 2.350000e+01 2.120000e+00 2.200000e+01 2.275000e+01 2.350000e+01 2.425000e+01 2.500000e+01
bmi 0 1.00 2.519000e+01 3.070000e+00 2.145000e+01 2.396000e+01 2.439000e+01 2.556000e+01 4.754000e+01
log_id 0 1.00 1.461772e+12 7.829948e+08 1.460444e+12 1.461079e+12 1.461802e+12 1.462375e+12 1.463098e+12
weight_log_mar_apr_cleaned <- weight_log_mar_apr_cleaned %>% 
  select(-fat)

weight_log_apr_may_cleaned <- weight_log_apr_may_cleaned %>%
  select(-fat)
write.csv(daily_activity_mar_apr_cleaned,
          file = "daily_activity_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(daily_activity_apr_may_cleaned,
          file = "daily_activity_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(daily_calories_apr_may_cleaned,
          file = "daily_calories_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(daily_intensities_apr_may_cleaned,
          file = "daily_intensities_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(daily_steps_apr_may_cleaned,
          file = "daily_steps_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(heart_rate_mar_apr_cleaned,
          file = "heart_rate_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(heart_rate_apr_may_cleaned,
          file = "heart_rate_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(hourly_calories_mar_apr_cleaned,
          file = "hourly_calories_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(hourly_calories_apr_may_cleaned,
          file = "hourly_calories_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(hourly_intensities_mar_apr_cleaned,
          file = "hourly_intensities_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(hourly_intensities_apr_may_cleaned,
          file = "hourly_intensities_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(hourly_steps_mar_apr_cleaned,
          file = "hourly_steps_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(hourly_steps_apr_may_cleaned,
          file = "hourly_steps_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(minute_calories_mar_apr_cleaned,
          file = "minute_calories_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(minute_calories_apr_may_cleaned,
          file = "minute_calories_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(minute_calories_apr_may_2_cleaned,
          file = "minute_calories_apr_may_2_cleaned.csv",
          row.names = FALSE)

write.csv(minute_intensities_mar_apr_cleaned,
          file = "minute_intensities_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(minute_intensities_apr_may_cleaned,
          file = "minute_intensities_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(minute_intensities_apr_may_2_cleaned,
          file = "minute_intensities_apr_may_2_cleaned.csv",
          row.names = FALSE)

write.csv(minute_mets_mar_apr_cleaned,
          file = "minute_mets_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(minute_mets_apr_may_cleaned,
          file = "minute_mets_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(minute_sleep_mar_apr_cleaned,
          file = "minute_sleep_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(minute_sleep_apr_may_cleaned,
          file = "minute_sleep_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(minute_steps_mar_apr_cleaned,
          file = "minute_steps_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(minute_steps_apr_may_cleaned,
          file = "minute_steps_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(minute_steps_apr_may_2_cleaned,
          file = "minute_steps_apr_may_2_cleaned.csv",
          row.names = FALSE)

write.csv(sleep_day_apr_may_cleaned,
          file = "sleep_day_apr_may_cleaned.csv",
          row.names = FALSE)

write.csv(weight_log_mar_apr_cleaned,
          file = "weight_log_mar_apr_cleaned.csv",
          row.names = FALSE)

write.csv(weight_log_apr_may_cleaned,
          file = "weight_log_apr_may_cleaned.csv",
          row.names = FALSE)

4. Analyze:

General Aggregation:

Daily total steps – 7281 steps

Daily total distance – 5.219 km = 3.24 miles

Daily lightly active minutes – 185.4 minutes

Daily fairly active minutes - 13.4 minutes

Daily very active minutes – 19.68 minutes

Daily sedentary minutes – 992.5 minutes -> 16.54 hours

Daily calories burned – 2266 calories

Daily heart rate – 78 (resting)

Daily asleep minutes – 419 minutes -> 6.98 hours

Daily time in bed - 458 minutes -> 7.63 hours

Daily weight log – 72.5 kg, 159.8 lbs, 25.37 bmi

Hourly calories – 97.39 calories/hr

Hourly intensities – 11.4 intensity minutes / hr  (mins spent in intensity)

Hourly steps – 302 steps/hr

Minute calories – 1.596 calories/min

Minute intensities - 0.19 intensity minutes / min

Minute metabolic equivalents (METs) – 14.45 mets/min

Minute sleep – 1 sleep stage

Minute steps – 5 steps/min

Daily Trends:

Users average 7281 steps per day, which is below the 8000-10000 daily steps recommended by the CDC.

Users travel a daily average total distance of 5.219 km, or 3.24 miles.

On a daily average, users spend about 4/5 of their tracked time sedentary (992.5 of 1210.98 minutes) and about 1/5 active (218.48 of 1210.98 minutes, combining lightly, fairly, and very active minutes). The CDC recommends 30 minutes of activity a day.
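The sedentary and active shares can be reproduced from the daily averages in the General Aggregation list. The following is a minimal sketch in base R; the vector name `daily_minutes` and the one-decimal rounding are illustrative choices, not part of the original analysis.

```r
# Daily averages (minutes per day) copied from the General Aggregation list.
daily_minutes <- c(
  sedentary      = 992.5,
  lightly_active = 185.4,
  fairly_active  = 13.4,
  very_active    = 19.68
)

# Total tracked minutes and the combined active minutes.
total_tracked <- sum(daily_minutes)
active_total  <- sum(daily_minutes[names(daily_minutes) != "sedentary"])

# Share of tracked time per category, as a percentage rounded to one decimal.
shares <- round(daily_minutes / total_tracked * 100, 1)
print(shares)
```

The sedentary share comes out at roughly 82%, which is where the "4/5 of the time" figure comes from.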

Credits: Hanna Shuraieva

On a daily average, users burn 2266 calories, which falls within the daily calorie ranges attributed to the CDC (2000-3000 for men and 1600-2400 for women).

Users have a daily average resting heart rate of 78, which is within the normal resting range (resting is 60-100; active is defined as greater than 100, according to the CDC).

On a daily average, users spend 7.63 hours in bed and 6.98 hours asleep, which falls just short of the CDC minimum of at least 7 hours of sleep.

On a daily average, users just cross into the overweight range with a BMI of 25.37 (where < 18.5 is underweight, 18.5-24.9 is normal, and ≥ 25 is overweight, according to the CDC).

Hourly Trends:

On an hourly average, users burn 97.39 calories, take 302 steps, and spend 11.4 minutes in intense activity (11.4/60 = 0.19), which is close to the roughly 1/5 of daily tracked time spent in active minutes.

Minute Trends:

On a per-minute average, users burn 1.596 calories, take 5 steps, and spend 0.19 minutes in intense activity (0.19 of each minute), which is again close to the roughly 1/5 of daily tracked time spent in active minutes.

On a per-minute average, users spend most of their recorded sleep time in sleep stage 1, lighter sleep. According to the CDC, more time should be spent in stage 3, where deep, non-rapid eye movement sleep occurs.

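On a per-minute average, users record a MET value of 14.45. Read at face value, this would indicate the high energy expenditure of intense physical activity (e.g., sitting = 1 MET, walking = 3 MET, running = 16 MET, according to PubMed standards). However, Fitabase documents its exported MET values as multiplied by 10; if that applies to this dataset, the average is about 1.4 METs, which is close to resting and consistent with the mostly sedentary daily pattern.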

Trends by Day and Time:

Tuesday was the weekday with the most consistent user activity and input for daily activity (steps, distance, sedentary minutes, calories), hourly data (calories, steps, intensity), and heart rate.

Wednesday was the weekday when most users recorded sleep and weight.

Total user output for hourly calories, intensities, and steps was greatest between 12-2pm and 5-7pm.

Red graphs indicate combined monthly datasets. The sleep_day dataset is graphed in blue to indicate that it was not combined and has no March to April data.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Tuesday, the total count of daily activity days was greatest.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Tuesday, the total count of heart rate days was greatest, followed by Friday and Wednesday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Tuesday, the total sum of daily steps was greatest, followed by Saturday and Wednesday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Tuesday, the total sum of daily distance was greatest, followed by Saturday and Wednesday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Tuesday, the total sum of daily sedentary minutes was greatest, followed by Friday and Wednesday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Tuesday, the total sum of daily calories burned was greatest, followed by Friday and Saturday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Tuesday, the total sum of hourly calories burned was greatest, followed by Wednesday and Thursday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Saturday, the total sum of hourly intensity was greatest, followed by Tuesday and Wednesday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Tuesday, the total sum of hourly steps was greatest, followed by Saturday and Wednesday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Wednesday, the total count of sleep days tracked was greatest, followed by Tuesday and Thursday.

Credits: Hanna Shuraieva and Emi Ly

Observations: On Wednesday, the total count of weight log days tracked was greatest, followed by Monday and Thursday.

Credits: Emi Ly

Observations: Between 12:00-14:00 (12-2pm) and 17:00-19:00 (5-7pm), the total sum of hourly calories burned was greatest.

Credits: Emi Ly

Observations: Between 12:00-14:00 (12-2pm) and 17:00-19:00 (5-7pm), the total sum of hourly intensity minutes was greatest.

Credits: Emi Ly

Observations: Between 12:00-14:00 (12-2pm) and 17:00-19:00 (5-7pm), the total sum of hourly steps was greatest.

Correlation Coefficient Relationships

Correlation coefficients range from -1 to +1. The sign gives the direction of the relationship and the absolute value gives its strength, interpreted as follows.

Correlation Type Correlation Coefficient
Negative correlation -1
No linear correlation 0
Positive correlation +1

Correlation Strength Coefficient Range (absolute value)
Very strong 0.8 - 1
Strong 0.6 - 0.8
Moderate 0.4 - 0.6
Weak 0.2 - 0.4
Very weak or none 0.0 - 0.2
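The strength table can be applied mechanically. The sketch below uses base R's `cut()`; the function name `correlation_strength` is hypothetical, and because the table's ranges share their endpoints, assigning each boundary value (for example exactly 0.6) to the stronger category is an assumption.

```r
# Map correlation coefficients to the strength labels in the table above.
# The sign only gives direction, so the absolute value is classified.
correlation_strength <- function(r) {
  stopifnot(is.numeric(r), all(abs(r) <= 1, na.rm = TRUE))
  cut(abs(r),
      breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1),
      labels = c("Very weak or none", "Weak", "Moderate", "Strong", "Very strong"),
      right = FALSE,          # intervals are [lower, upper)
      include.lowest = TRUE)  # so that a coefficient of exactly 1 is still labelled
}

# Coefficients taken from the correlation tables in this section.
correlation_strength(c(0.986, -0.523, 0.635, -0.311, -0.062))
```

Classifying the absolute value keeps a coefficient such as -0.523 in the "Moderate" band even though the relationship is negative.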

Exploration of correlation coefficients between different variables in the datasets.

Dataset: daily_activity Coefficient Strength
calories / total_distance 0.635 Strong
calories / sedentary_minutes -0.062 Very weak
calories / total_steps 0.590 Moderate
calories / very_active_minutes 0.582 Moderate
total_steps / sedentary_minutes -0.311 Weak
total_steps / total_distance 0.986 Very strong
total_steps / very_active_minutes 0.677 Strong

Observations: A coefficient of 0.986 indicates a very strong correlation between users' total distance and total steps in the combined daily_activity (daily_activity_mar_apr, daily_activity_apr_may) dataset. As the number of steps increases, the total distance increases, and vice versa.
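The report does not show how the coefficients were computed. One plausible approach, assuming Pearson correlation with base R's `cor()`, is sketched here on a small toy data frame; the values in `toy_activity` are hypothetical and are not taken from the Fitbit files.

```r
# Toy data (hypothetical values) standing in for two daily_activity columns.
toy_activity <- data.frame(
  total_steps    = c(2000, 5000, 7500, 10000, 12500),
  total_distance = c(1.4, 3.6, 5.3, 7.1, 9.0)
)

# Pearson correlation between the two columns; cor() defaults to method = "pearson".
r <- cor(toy_activity$total_steps, toy_activity$total_distance)

# use = "complete.obs" drops rows where either value is missing, which matters
# once merged datasets contain NA values.
r_complete <- cor(toy_activity$total_steps, toy_activity$total_distance,
                  use = "complete.obs")

print(round(r, 3))
```

The same call applied to `daily_activity$total_steps` and `daily_activity$total_distance` would be the natural way to check the 0.986 figure.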

Dataset: sleep_day Coefficient Strength
total_time_in_bed / total_minutes_asleep 0.930 Very strong

Observations: A coefficient of 0.930 indicates a very strong correlation between users' total minutes asleep and total time in bed in the sleep_day_apr_may dataset. As the amount of time users spend in bed increases, the total minutes asleep increases, and vice versa.

Dataset: daily_activity + sleep_day Coefficient Strength
calories / total_minutes_asleep -0.036 Very weak
total_distance / total_minutes_asleep -0.176 Very weak
sedentary_minutes / total_minutes_asleep -0.523 Moderate
total_steps / total_minutes_asleep -0.190 Very weak
very_active_minutes / total_minutes_asleep -0.097 Very weak

Observations: A coefficient of -0.523 indicates a moderate negative correlation between users' total minutes asleep and sedentary minutes in the combined daily_activity (daily_activity_mar_apr, daily_activity_apr_may) and sleep_day_apr_may datasets. Because the coefficient is negative, users with more sedentary minutes tend to record fewer minutes asleep, and vice versa.

Dataset: daily_activity + weight_log Coefficient Strength
calories / weight_pounds 0.533 Moderate
total_distance / weight_pounds 0.235 Weak
sedentary_minutes / weight_pounds 0.468 Moderate
total_steps / weight_pounds 0.120 Very weak
very_active_minutes / weight_pounds 0.280 Weak

Observations: A coefficient of 0.533 indicates a moderate correlation between users' weight in pounds and calories burned in the combined daily_activity (daily_activity_mar_apr, daily_activity_apr_may) and combined weight_log (weight_log_mar_apr, weight_log_apr_may) datasets. Users with a higher weight may tend to burn more calories, and vice versa.

Dataset: daily_calories + daily_intensities + daily_steps Coefficient Strength
calories / sedentary_minutes -0.107 Very weak
calories / total_steps 0.591 Moderate
calories / very_active_minutes 0.616 Strong
total_steps / sedentary_minutes -0.327 Weak
total_steps / very_active_minutes 0.667 Strong

Observations: A coefficient of 0.667 indicates a strong correlation between users' total daily steps and very active minutes in the combined daily_cal_int_ste (daily_calories_apr_may, daily_intensities_apr_may, daily_steps_apr_may) datasets. As users spend more time in very active minutes, total steps tend to increase, and vice versa.

Dataset: hourly_calories + hourly_intensities + hourly_steps Coefficient Strength
calories / total_intensity 0.897 Very strong
calories / total_steps 0.808 Very strong
total_intensity / total_steps 0.892 Very strong

Observations: A coefficient of 0.897 indicates a very strong correlation between users' total calories burned and intensity in the combined hourly_cal_int_ste (hourly_calories_mar_apr, hourly_calories_apr_may, hourly_intensities_mar_apr, hourly_intensities_apr_may, hourly_steps_mar_apr, hourly_steps_apr_may) datasets. As the number of user intensity minutes increases, the total calories burned increases, and vice versa.

Dataset: minute_calories + minute_intensities + minute_mets + minute_steps + heart_rate Coefficient Strength
calories / total_intensity 0.894 Very strong
calories / mets 0.954 Very strong
calories / total_steps 0.824 Very strong
calories / heart_rate 0.733 Strong
heart_rate / total_intensity 0.704 Strong
heart_rate / mets 0.768 Strong
heart_rate / total_steps 0.62 Strong
total_intensity / mets 0.940 Very strong
total_intensity / total_steps 0.808 Very strong
mets / total_steps 0.884 Very strong

Observations: The minute datasets contain millions of data points, which creates a tool limitation: graphing them is not feasible without appropriate technology in place. Most relationships in the minute datasets are strong to very strong correlations, ranging from 0.62 to 0.954, and could be reliable trends to explore further.
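One common way around this plotting limitation is to graph a random sample of rows while still computing statistics on the full data. This is a sketch of that idea and not something the original analysis did; `minute_example`, its simulated values, the seed, and the sample size are all hypothetical.

```r
set.seed(42)  # fixed seed so the same rows are drawn on every run

# Hypothetical stand-in for a merged minute-level data frame.
minute_example <- data.frame(
  calories = runif(100000, min = 0.7, max = 12),
  mets     = runif(100000, min = 10, max = 120)
)

# Draw a random subset of row indices without replacement for plotting.
sample_size   <- 5000
sampled_rows  <- sample(nrow(minute_example), size = sample_size)
minute_sample <- minute_example[sampled_rows, ]

# The correlation is still computed on the full data; only the plot uses the sample.
full_r <- cor(minute_example$calories, minute_example$mets)
```

A scatterplot of `minute_sample` renders quickly and usually preserves the overall shape of the relationship, though rare extreme points may be missed.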

Correlation Interpretations:

Correlation results are organized into categories based on strength and on the time granularity of the data (daily, hourly, or minute).

Weak to very weak correlations (0.0 - 0.4) need significantly more data to support claims.

calories / sedentary_minutes (daily)

total_steps / sedentary_minutes (daily)

calories / total_minutes_asleep (daily)

total_distance / total_minutes_asleep (daily)

total_steps / total_minutes_asleep (daily)

very_active_minutes / total_minutes_asleep (daily)

total_distance / weight_pounds (daily)

total_steps / weight_pounds (daily)

very_active_minutes / weight_pounds (daily)

calories / sedentary_minutes (daily)

total_steps / sedentary_minutes (daily)

Moderate correlations (0.4 - 0.6) could be potential trends to take advantage of.

calories / total_steps (daily)

calories / very_active_minutes (daily)

sedentary_minutes / total_minutes_asleep (daily)

calories / weight_pounds (daily)

sedentary_minutes / weight_pounds (daily)

Strong correlations (0.6 - 0.8) are trends worth recommending.

calories / total_distance (daily)

total_steps / very_active_minutes (daily)

calories / very_active_minutes (daily)

calories / heart_rate (minute)

heart_rate / total_intensity (minute)

heart_rate / mets (minute)

heart_rate / total_steps (minute)

Very strong correlations (0.8 - 1) are the most reliable trends to capture.

total_steps / total_distance (daily)

total_time_in_bed / total_minutes_asleep (daily)

calories / total_intensity (hourly)

calories / total_steps (hourly)

total_intensity / total_steps (hourly)

calories / total_intensity (minute)

calories / mets (minute)

calories / total_steps (minute)

total_intensity / mets (minute)

total_intensity / total_steps (minute)

mets / total_steps (minute)

Setting up my environment

Setting up my R environment by loading ‘tidyverse’, ‘here’, ‘skimr’ and ‘janitor’ packages.

Analysis of Data

Analysis of cleaned datasets for trends and relationships.

daily_activity_mar_apr <- read.csv("daily_activity_mar_apr_cleaned.csv")

daily_activity_apr_may <- read.csv("daily_activity_apr_may_cleaned.csv")

daily_calories_apr_may <- read.csv("daily_calories_apr_may_cleaned.csv")

daily_intensities_apr_may <- read.csv("daily_intensities_apr_may_cleaned.csv")

daily_steps_apr_may <- read.csv("daily_steps_apr_may_cleaned.csv")

heart_rate_mar_apr <- read.csv("heart_rate_mar_apr_cleaned.csv")

heart_rate_apr_may <- read.csv("heart_rate_apr_may_cleaned.csv")

hourly_calories_mar_apr <- read.csv("hourly_calories_mar_apr_cleaned.csv")

hourly_calories_apr_may <- read.csv("hourly_calories_apr_may_cleaned.csv")

hourly_intensities_mar_apr <- read.csv("hourly_intensities_mar_apr_cleaned.csv")

hourly_intensities_apr_may <- read.csv("hourly_intensities_apr_may_cleaned.csv")

hourly_steps_mar_apr <- read.csv("hourly_steps_mar_apr_cleaned.csv")

hourly_steps_apr_may <- read.csv("hourly_steps_apr_may_cleaned.csv")

minute_calories_mar_apr <- read.csv("minute_calories_mar_apr_cleaned.csv")

minute_calories_apr_may <- read.csv("minute_calories_apr_may_cleaned.csv")

minute_intensities_mar_apr <- read.csv("minute_intensities_mar_apr_cleaned.csv")

minute_intensities_apr_may <- read.csv("minute_intensities_apr_may_cleaned.csv")

minute_mets_mar_apr <- read.csv("minute_mets_mar_apr_cleaned.csv")

minute_mets_apr_may <- read.csv("minute_mets_apr_may_cleaned.csv")

minute_sleep_mar_apr <- read.csv("minute_sleep_mar_apr_cleaned.csv")

minute_sleep_apr_may <- read.csv("minute_sleep_apr_may_cleaned.csv")

minute_steps_mar_apr <- read.csv("minute_steps_mar_apr_cleaned.csv")

minute_steps_apr_may <- read.csv("minute_steps_apr_may_cleaned.csv")

sleep_day_apr_may <- read.csv("sleep_day_apr_may_cleaned.csv")

weight_log_mar_apr <- read.csv("weight_log_mar_apr_cleaned.csv")

weight_log_apr_may <- read.csv("weight_log_apr_may_cleaned.csv")
daily_activity <- rbind(daily_activity_mar_apr, daily_activity_apr_may)

heart_rate <- rbind(heart_rate_mar_apr, heart_rate_apr_may)

hourly_calories <- rbind(hourly_calories_mar_apr, hourly_calories_apr_may)

hourly_intensities <- rbind(hourly_intensities_mar_apr, hourly_intensities_apr_may)

hourly_steps <- rbind(hourly_steps_mar_apr, hourly_steps_apr_may)

minute_calories <- rbind(minute_calories_mar_apr, minute_calories_apr_may)

minute_intensities <- rbind(minute_intensities_mar_apr, minute_intensities_apr_may)

minute_mets <- rbind(minute_mets_mar_apr, minute_mets_apr_may)

minute_sleep <- rbind(minute_sleep_mar_apr, minute_sleep_apr_may)

minute_steps <- rbind(minute_steps_mar_apr, minute_steps_apr_may)

weight_log <- rbind(weight_log_mar_apr, weight_log_apr_may)
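Before stacking two monthly exports it can help to confirm that they share the same columns and to drop rows that appear in both files. The helper below is a hypothetical sketch (`bind_months` is not part of the original analysis), and whether the two periods actually overlap is an assumption that should be checked against the data.

```r
# Hypothetical helper: stack two monthly data frames only if their columns match.
bind_months <- function(first, second) {
  if (!setequal(names(first), names(second))) {
    mismatched <- setdiff(union(names(first), names(second)),
                          intersect(names(first), names(second)))
    stop("Column names differ: ", paste(mismatched, collapse = ", "))
  }
  combined <- rbind(first, second)  # rbind() matches data frame columns by name
  # Rows present in both exports would otherwise be counted twice.
  combined[!duplicated(combined), ]
}

# Toy example (hypothetical values): the row for id 2 appears in both months.
march <- data.frame(id = c(1, 2), total_steps = c(100, 200))
april <- data.frame(id = c(2, 3), total_steps = c(200, 300))
bind_months(march, april)
```

Removing exact duplicates changes row counts and sums, so any totals computed on the combined data would need to be re-run after applying it.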
library("plotrix")

sedentary_minutes <- sum(daily_activity$sedentary_minutes)

lightly_active_minutes <- sum(daily_activity$lightly_active_minutes)

fairly_active_minutes <- sum(daily_activity$fairly_active_minutes)

very_active_minutes <- sum(daily_activity$very_active_minutes)

daily_activity_minutes <- data.frame(
  activity = c("sedentary", "lightly_active", "fairly_active", "very_active"),
  activity_minutes = c(sedentary_minutes, lightly_active_minutes, fairly_active_minutes, very_active_minutes))

daily_activity_minutes$percentage <- round(daily_activity_minutes$activity_minutes / sum(daily_activity_minutes$activity_minutes) * 100)

pie3D(daily_activity_minutes$percentage,
      labels = paste0(daily_activity_minutes$percentage,  "%"),
      main = "Percentage of Active Minutes by Daily Activity",
      col = c("skyblue", "lightgreen", "lightcoral", "lightyellow"),
      border = "black",
      labelcex = 0.9)
legend("topright", daily_activity_minutes$activity, cex = 0.8,
       fill = c("skyblue", "lightgreen", "lightcoral", "lightyellow"))

daily_activity$weekdays <- format(as.Date(daily_activity$activity_date, format = "%Y-%m-%d"), format = "%A")

heart_rate$weekdays <- format(as.Date(heart_rate$activity_date, format = "%Y-%m-%d"), format = "%A")

weight_log$weekdays <- format(as.Date(weight_log$activity_date, format = "%Y-%m-%d"), format = "%A")

hourly_calories$weekdays <- format(as.Date(hourly_calories$activity_date, format = "%Y-%m-%d"), format = "%A")

hourly_intensities$weekdays <- format(as.Date(hourly_intensities$activity_date, format = "%Y-%m-%d"), format = "%A")

hourly_steps$weekdays <- format(as.Date(hourly_steps$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_calories$weekdays <- format(as.Date(minute_calories$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_intensities$weekdays <- format(as.Date(minute_intensities$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_mets$weekdays <- format(as.Date(minute_mets$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_sleep$weekdays <- format(as.Date(minute_sleep$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_steps$weekdays <- format(as.Date(minute_steps$activity_date, format = "%Y-%m-%d"), format = "%A")

daily_activity_apr_may$weekdays <- format(as.Date(daily_activity_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

daily_calories_apr_may$weekdays <- format(as.Date(daily_calories_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

daily_intensities_apr_may$weekdays <- format(as.Date(daily_intensities_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

daily_steps_apr_may$weekdays <- format(as.Date(daily_steps_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

heart_rate_apr_may$weekdays <- format(as.Date(heart_rate_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

hourly_calories_apr_may$weekdays <- format(as.Date(hourly_calories_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

hourly_intensities_apr_may$weekdays <- format(as.Date(hourly_intensities_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

hourly_steps_apr_may$weekdays <- format(as.Date(hourly_steps_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_calories_apr_may$weekdays <- format(as.Date(minute_calories_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_intensities_apr_may$weekdays <- format(as.Date(minute_intensities_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_mets_apr_may$weekdays <- format(as.Date(minute_mets_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_sleep_apr_may$weekdays <- format(as.Date(minute_sleep_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

minute_steps_apr_may$weekdays <- format(as.Date(minute_steps_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

sleep_day_apr_may$weekdays <- format(as.Date(sleep_day_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")

weight_log_apr_may$weekdays <- format(as.Date(weight_log_apr_may$activity_date, format = "%Y-%m-%d"), format = "%A")
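The repeated weekday assignments above could be collapsed into one helper. The sketch below is an illustration, not the original code: `add_weekdays` is a hypothetical name, and it assumes every data frame stores `activity_date` as "YYYY-MM-DD" text. It also stores the result as an ordered factor, which differs from the character column created above.

```r
# Hypothetical helper: add a weekday column to any data frame that has an
# activity_date column stored as "YYYY-MM-DD" text.
add_weekdays <- function(df) {
  dates <- as.Date(df$activity_date, format = "%Y-%m-%d")
  # Build the Sunday-to-Saturday level order from a known week (2016-04-10 was
  # a Sunday) so the labels follow the current locale instead of being hard-coded.
  week_levels <- format(as.Date("2016-04-10") + 0:6, "%A")
  df$weekdays <- factor(format(dates, "%A"), levels = week_levels)
  df
}

# Toy example (hypothetical rows): a Tuesday and a Sunday.
example_days <- data.frame(activity_date = c("2016-04-12", "2016-04-10"))
example_days <- add_weekdays(example_days)
```

Because the factor levels already run Sunday to Saturday, plots built on this column keep calendar order without repeating the `levels = c("Sunday", ...)` vector in every `aes()` call.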
ggplot(data = daily_activity) +
  geom_bar(fill = "darkred", mapping = aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")))) +
  labs(title = "Total Daily Activity Days Tracked During The Week", x = "Week Day", y = "Total Days")

ggplot(data = heart_rate) +
  geom_bar(fill = "darkred", mapping = aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")))) +
  labs(title = "Total Heart Rate Days Tracked During The Week", x = "Week Day", y = "Total Days")

ggplot(data = sleep_day_apr_may) +
  geom_bar(fill = "steelblue", mapping = aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")))) +
  labs(title = "Total Sleep Days Tracked During The Week", x = "Week Day", y = "Total Days")

ggplot(data = weight_log) +
  geom_bar(fill = "darkred", mapping = aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")))) +
  labs(title = "Total Weight Log Days Tracked During The Week", x = "Week Day", y = "Total Days")

ggplot(data = daily_activity, aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")), y = total_steps)) +
  geom_bar(stat = "identity", fill = "darkred") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Steps Daily During The Week", x = "Week Day", y = "Total Steps")

ggplot(data = daily_activity, aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")), y = total_distance)) +
  geom_bar(stat = "identity", fill = "darkred") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Distance Daily During The Week", x = "Week Day", y = "Total Distance")

ggplot(data = daily_activity, aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")), y = sedentary_minutes)) +
  geom_bar(stat = "identity", fill = "darkred") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Sedentary Minutes Daily During The Week", x = "Week Day", y = "Sedentary Minutes")

ggplot(data = daily_activity, aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")), y = calories)) +
  geom_bar(stat = "identity", fill = "darkred") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Calories Burned Daily During The Week", x = "Week Day", y = "Calories")

ggplot(data = hourly_calories, aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")), y = calories)) +
  geom_bar(stat = "identity", fill = "darkred") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Hourly Calories Burned Daily", x = "Week Day", y = "Calories Burned")

ggplot(data = hourly_steps, aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")), y = total_steps)) +
  geom_bar(stat = "identity", fill = "darkred") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Steps Hourly", x = "Week Day", y = "Total Steps")

ggplot(data = hourly_intensities, aes(x = factor(weekdays, levels = c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")), y = total_intensity)) +
  geom_bar(stat = "identity", fill = "darkred") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Intensity Hourly", x = "Week Day", y = "Total Intensity")

ggplot(data = hourly_calories, aes(x = activity_time, y = calories, fill = activity_time)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Hourly Calories Burned By Hour", x = "Hour", y = "Calories Burned")

ggplot(data = hourly_steps, aes(x = activity_time, y = total_steps, fill = activity_time)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Hourly Steps By Hour", x = "Hour", y = "Total Steps")

ggplot(data = hourly_intensities, aes(x = activity_time, y = total_intensity, fill = activity_time)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Total Hourly Intensities By Hour", x = "Hour", y = "Total Intensity")

daily_activity_sleep_merged <- merge(daily_activity, sleep_day_apr_may, by = c("id", "activity_date"))

daily_activity_weight_merged <- merge(daily_activity, weight_log, by = c("id", "activity_date"))

daily_cal_int_merged <- merge(daily_calories_apr_may, daily_intensities_apr_may, by = c("id", "activity_date"))

daily_cal_int_ste_merged <- merge(daily_cal_int_merged, daily_steps_apr_may, by = c("id", "activity_date"))

sleep_weight_merged <- merge(sleep_day_apr_may, weight_log, by = c("id", "activity_date"))

hourly_calories_steps_merged <- merge(hourly_calories, hourly_steps, by = c("id", "activity_date", "activity_time"))

hourly_calories_steps_intensities_merged <- merge(hourly_calories_steps_merged, hourly_intensities, by = c("id", "activity_date", "activity_time"))  

minute_cal_int <- merge(minute_calories, minute_intensities, by = c("id", "activity_date", "activity_time"))

minute_cal_int_met <- merge(minute_cal_int, minute_mets, by = c("id", "activity_date", "activity_time"))

minute_cal_int_met_ste <- merge(minute_cal_int_met, minute_steps, by = c("id", "activity_date", "activity_time"))
## Warning in merge.data.frame(minute_cal_int_met, minute_steps, by = c("id", :
## column names 'weekdays.x', 'weekdays.y' are duplicated in the result
heart_rate_minute_merged <- merge(heart_rate, minute_cal_int_met_ste, by = c("id", "activity_date", "activity_time"))
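
The warning above most likely arises because each minute-level table carries its own `weekdays` column, so repeated merges generate the `weekdays.x` and `weekdays.y` suffixes more than once. One possible way around it, sketched here on made-up toy tables rather than the Fitbit data, is to drop the redundant column before merging and rebuild it from `activity_date` afterwards. The `toy_*` objects and the helpers `make_tbl` and `drop_weekdays` are hypothetical names used only for illustration.

```r
# Toy stand-ins for the minute-level tables; all values are made up.
keys <- c("id", "activity_date", "activity_time")

make_tbl <- function(value_name, values) {
  tbl <- data.frame(
    id = c(1, 1),
    activity_date = as.Date(c("2016-04-12", "2016-04-12")),
    activity_time = c("00:00:00", "00:01:00"),
    weekdays = c("Tuesday", "Tuesday")
  )
  tbl[[value_name]] <- values
  tbl
}

toy_calories    <- make_tbl("calories", c(0.8, 0.9))
toy_intensities <- make_tbl("intensity", c(0, 1))
toy_mets        <- make_tbl("mets", c(10, 12))
toy_steps       <- make_tbl("total_steps", c(0, 7))

# Drop the redundant weekdays column so merge() never has to add suffixes.
drop_weekdays <- function(tbl) {
  tbl[, setdiff(names(tbl), "weekdays"), drop = FALSE]
}
tables <- lapply(
  list(toy_calories, toy_intensities, toy_mets, toy_steps),
  drop_weekdays
)

# Merge all four tables on the shared keys in one pass.
toy_merged <- Reduce(function(x, y) merge(x, y, by = keys), tables)

# Rebuild a single weekdays column from the date (names depend on locale).
toy_merged$weekdays <- weekdays(toy_merged$activity_date)
names(toy_merged)
```

The merged result then has one `weekdays` column and no duplicated names.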

Graphing scatterplots and best-fit trends of variables:

ggplot(data = daily_activity, aes(x = total_steps, y = total_distance)) +
  geom_point() +
  geom_smooth(method = "auto", col = "blue") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Daily Total Distance Based On Total Steps")

ggplot(data = sleep_day_apr_may, aes(x = total_time_in_bed, y = total_minutes_asleep)) +
  geom_point() +
  geom_smooth(method = "auto", col = "blue") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Daily Total Minutes Asleep Based On Total Time In Bed")

ggplot(data = daily_activity_sleep_merged, aes(x = sedentary_minutes, y = total_minutes_asleep)) +
  geom_point() +
  geom_smooth(method = "auto", col = "blue") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Daily Total Minutes Asleep Based On Total Sedentary Minutes")

ggplot(data = daily_activity_weight_merged, aes(x = calories, y = weight_pounds)) +
  geom_point() +
  geom_smooth(method = "auto", col = "blue") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Daily Weight Based On Total Calories Burned")

ggplot(data = daily_cal_int_ste_merged, aes(x = very_active_minutes, y = total_steps)) +
  geom_point() +
  geom_smooth(method = "auto", col = "blue") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Daily Total Steps Based On Total Very Active Minutes")

ggplot(data = hourly_calories_steps_intensities_merged, aes(x = total_intensity, y = calories)) +
  geom_point() +
  geom_smooth(method = "auto", col = "blue") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Hourly Calories Burned Based On Total Intensity")

ggplot(data = minute_cal_int_met_ste, aes(x = mets, y = calories)) +
  geom_point() +
  geom_smooth(method = "auto", col = "blue") +
  theme(plot.title = element_text(size = 16, hjust = 0.5)) +
  labs(title = "Minute Calories Burned Based On Metabolic Equivalent Minutes")

Regression analysis, correlations between variables:

cor(daily_activity$calories, daily_activity$very_active_minutes)
cor(daily_activity$calories, daily_activity$sedentary_minutes)
cor(daily_activity$calories, daily_activity$total_distance)
cor(daily_activity$total_steps, daily_activity$calories)
cor(daily_activity$total_steps, daily_activity$total_distance)
cor(daily_activity$total_steps, daily_activity$very_active_minutes)
cor(daily_activity$total_steps, daily_activity$sedentary_minutes)

cor(daily_activity_sleep_merged$total_steps, daily_activity_sleep_merged$total_minutes_asleep)
cor(daily_activity_sleep_merged$total_distance, daily_activity_sleep_merged$total_minutes_asleep)
cor(daily_activity_sleep_merged$sedentary_minutes, daily_activity_sleep_merged$total_minutes_asleep)
cor(daily_activity_sleep_merged$calories, daily_activity_sleep_merged$total_minutes_asleep)
cor(daily_activity_sleep_merged$very_active_minutes, daily_activity_sleep_merged$total_minutes_asleep)

cor(daily_activity_weight_merged$total_steps, daily_activity_weight_merged$weight_pounds)
cor(daily_activity_weight_merged$total_distance, daily_activity_weight_merged$weight_pounds)
cor(daily_activity_weight_merged$sedentary_minutes, daily_activity_weight_merged$weight_pounds)
cor(daily_activity_weight_merged$calories, daily_activity_weight_merged$weight_pounds)
cor(daily_activity_weight_merged$very_active_minutes, daily_activity_weight_merged$weight_pounds)

cor(sleep_day_apr_may$total_time_in_bed, sleep_day_apr_may$total_minutes_asleep)

cor(sleep_weight_merged$total_minutes_asleep, sleep_weight_merged$weight_pounds)
cor(sleep_weight_merged$total_time_in_bed, sleep_weight_merged$weight_pounds)
cor(sleep_weight_merged$total_minutes_asleep, sleep_weight_merged$weight_pounds)

cor(daily_cal_int_ste_merged$total_steps, daily_cal_int_ste_merged$calories)
cor(daily_cal_int_ste_merged$total_steps, daily_cal_int_ste_merged$sedentary_minutes)
cor(daily_cal_int_ste_merged$total_steps, daily_cal_int_ste_merged$very_active_minutes)
cor(daily_cal_int_ste_merged$calories, daily_cal_int_ste_merged$sedentary_minutes)
cor(daily_cal_int_ste_merged$calories, daily_cal_int_ste_merged$very_active_minutes)

cor(hourly_calories_steps_intensities_merged$total_steps, hourly_calories_steps_intensities_merged$calories)
cor(hourly_calories_steps_intensities_merged$total_steps, hourly_calories_steps_intensities_merged$total_intensity)
cor(hourly_calories_steps_intensities_merged$calories, hourly_calories_steps_intensities_merged$total_intensity)

cor(minute_cal_int_met_ste$total_steps, minute_cal_int_met_ste$calories)
cor(minute_cal_int_met_ste$total_steps, minute_cal_int_met_ste$intensity)
cor(minute_cal_int_met_ste$total_steps, minute_cal_int_met_ste$mets)
cor(minute_cal_int_met_ste$intensity, minute_cal_int_met_ste$calories)
cor(minute_cal_int_met_ste$intensity, minute_cal_int_met_ste$mets)
cor(minute_cal_int_met_ste$mets, minute_cal_int_met_ste$calories)

cor(heart_rate_minute_merged$heart_rate, heart_rate_minute_merged$calories)
cor(heart_rate_minute_merged$heart_rate, heart_rate_minute_merged$intensity)
cor(heart_rate_minute_merged$heart_rate, heart_rate_minute_merged$mets)
cor(heart_rate_minute_merged$heart_rate, heart_rate_minute_merged$total_steps)

5. Share:

Dashboard:

6. Act:

What are some trends in smart device usage? How could these trends apply to Bellabeat customers?

Judging by the number of observations, smart devices place significant emphasis on tracking active metrics, including heart rate, steps, distance, calories burned, energy expenditure (METs), and intensity. Non-active metrics like sleep and weight receive much less emphasis.

Under CDC daily recommendations, Fitbit users on average met the target daily goals for calories burned (2000-3000 calories for men, 1600-2400 calories for women) and active minutes (>=30 minutes). However, users on average fell short of the targets for total steps (8000-10000 steps), heart rate (>=100 value), sleep (>=7 hours), and weight (BMI <18.5 underweight, 18.5-24.9 normal, >=25 overweight).

Users spend 18% of their daily time active and 82% sedentary, or roughly one-fifth active and four-fifths sedentary.
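
As a rough illustration of how an active versus sedentary split like this can be derived, the sketch below sums minute columns on made-up data. The exact calculation behind the 18% figure is not shown in this section, so this is an assumed approach, and the column names `fairly_active_minutes` and `lightly_active_minutes` are assumptions alongside the `very_active_minutes` and `sedentary_minutes` columns used earlier.

```r
# Toy daily records; values are made up and do not come from the Fitbit data.
toy_daily <- data.frame(
  very_active_minutes    = c(20, 30),
  fairly_active_minutes  = c(10, 20),   # assumed column name
  lightly_active_minutes = c(200, 180), # assumed column name
  sedentary_minutes      = c(1000, 940)
)

active_cols <- c(
  "very_active_minutes",
  "fairly_active_minutes",
  "lightly_active_minutes"
)

# Total tracked minutes in each state across all days.
active_total    <- sum(rowSums(toy_daily[, active_cols]))
sedentary_total <- sum(toy_daily$sedentary_minutes)

# Share of tracked time spent active.
active_share <- active_total / (active_total + sedentary_total)
round(100 * c(active = active_share, sedentary = 1 - active_share), 1)
```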

Users on average are overweight, with a BMI of 25.37. This average may not be representative of the whole sample, as it sits only 0.37 BMI points above the overweight threshold of 25 under CDC recommendations.
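
The BMI bands quoted earlier can be written as a small lookup, which makes it easy to see how close the average sits to the boundary. This is a minimal sketch: `classify_bmi` is a hypothetical helper, and treating a BMI of exactly 25 as overweight is an assumption made here.

```r
# Hypothetical helper: map BMI values to the bands quoted in the text.
# Intervals are closed on the left, so 18.5 is "normal" and 25 is "overweight".
classify_bmi <- function(bmi) {
  cut(
    bmi,
    breaks = c(-Inf, 18.5, 25, Inf),
    labels = c("underweight", "normal", "overweight"),
    right = FALSE
  )
}

# The average reported above, plus two nearby values for comparison.
as.character(classify_bmi(c(25.37, 24.9, 18.4)))
```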

Users spend a daily average of 7 ½ hours in bed and 6.98 hours asleep, which falls just short of the CDC minimum of at least 7 hours. On a per-minute basis, users spend more time in sleep stage 1, the lighter stage of sleep. According to the CDC, more time should be spent in stage 3, where deep, non-rapid eye movement sleep occurs.

Users show the greatest overall input for active metrics on Tuesdays. For passive metrics like sleep and weight, Wednesdays show the most input.

Users show the greatest total hourly input in active metrics like calories burned, intensity, and steps between the hours of 12-2PM and 5-7PM.
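
Peak hours like these can be read off by summing a metric per hour across all users and sorting. The sketch below does this in base R on made-up values; `toy_hourly` is a hypothetical stand-in for a table shaped like `hourly_steps`.

```r
# Toy hourly records for two users; values are made up.
toy_hourly <- data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  activity_time = c("08:00:00", "12:00:00", "18:00:00",
                    "08:00:00", "12:00:00", "18:00:00"),
  total_steps = c(300, 900, 1200, 100, 700, 1500)
)

# Sum steps per hour of day across all users.
steps_by_hour <- aggregate(total_steps ~ activity_time, data = toy_hourly, FUN = sum)

# Sort so the busiest hours come first.
steps_by_hour <- steps_by_hour[order(-steps_by_hour$total_steps), ]
head(steps_by_hour, 2)
```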

To understand a user’s daily activity, sleep, and stress from daily, hourly, and minute-level inputs, the most useful variables are those whose correlation coefficients indicate a very strong relationship. Tracking these specific fields on smart devices would be of immediate benefit to Bellabeat users and can be reliably explored further.

Very strong correlation (0.8 - 1.0)

total_steps / total_distance (daily) – 0.986

total_time_in_bed / total_minutes_asleep (daily) – 0.930

calories / total_intensity (hourly) – 0.897

calories / total_steps (hourly) – 0.808

total_intensity / total_steps (hourly) – 0.892

calories / total_intensity (minute) – 0.894

calories / mets (minute) – 0.954

calories / total_steps (minute) – 0.824

total_intensity / mets (minute) – 0.940

total_intensity / total_steps (minute) – 0.808

mets / total_steps (minute) – 0.884
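
The banding used for this list can be applied programmatically. In the sketch below, only the "very strong" band (0.8 - 1.0) comes from the text above. The lower bands and their labels are assumed for illustration, and `classify_r` is a hypothetical helper.

```r
# Hypothetical helper: label correlation strength by absolute value.
# Only the 0.8 - 1.0 "very strong" band is from the text; the rest are assumed.
classify_r <- function(r) {
  cut(
    abs(r),
    breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1),
    labels = c("very weak", "weak", "moderate", "strong", "very strong"),
    include.lowest = TRUE, # with right = FALSE this keeps r = 1 in the top band
    right = FALSE          # intervals are [lower, upper), so 0.8 is "very strong"
  )
}

# Three of the coefficients listed above.
r_values <- c(
  "total_steps / total_distance (daily)" = 0.986,
  "total_time_in_bed / total_minutes_asleep (daily)" = 0.930,
  "calories / total_steps (hourly)" = 0.808
)

data.frame(
  pair = names(r_values),
  r = unname(r_values),
  strength = classify_r(r_values)
)
```

Taking the absolute value means a strongly negative coefficient is banded the same way as a strongly positive one.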

How could these trends help influence Bellabeat marketing strategy?

Partner on a campaign with a health organization like the CDC to educate users on which daily metrics contribute to a healthy, active lifestyle, and encourage them to set goals on the Bellabeat app.

Active metrics are tracked passively and frequently by smart devices. For non-active metrics like sleep and weight, a notification on the user’s device or an incentive on the Bellabeat app could prompt users to manually input data more often.

Target the middle of the week (Tuesday/Wednesday) and the hours of 12-2PM and 5-7PM for user notifications and incentives on the Bellabeat app.

Users should be able to see a more complete end-of-day total that is automatically summed from hourly or minute metrics. Daily active metrics on the Bellabeat app need to be tracked more consistently, on top of the existing hourly and minute metrics, to better establish the relationship between daily active metrics and daily passive metrics such as sleep and weight.